accents (again)
utf-8 and others...
#1 Dec 6, 2014, 1:31 am

Matteo2303
Apprentice
Group: Members
Posts: 57
Joined: Aug 25, 2003

I have read that accents have been discussed here before, but my need is a little different.
I want to stay in an ASCII context, but I'd like to accept àèìòù in user input and convert them like this:
à->a' è->e' ì->i' ò->o' and ù->u'

Apparently it's simple with something like this:
else if ( isascii(d->inbuf[i]) && isprint(d->inbuf[i]) )
    d->incomm[k++] = d->inbuf[i];
else
    switch ( (unsigned char)d->inbuf[i] ) /* compare as an unsigned byte, not a signed char */
    {
        case 0xE0: /* 'à' in ISO-8859-1 */
            d->incomm[k++] = 'a'; d->incomm[k++] = '\''; break;
        /* ... CUT .... */
    }


This generally works if the client sends data in an ISO format (one byte per character), but not with clients that use UTF-8 or another charset.

From my machine, using different clients and OSes, if I press "à" I receive different results:

TELNET: à -> "à"
           letter 0xffffff85 is number -123

ZMUD and TINTIN++: à-> "à"
           letter 0xffffffe0 is number -32

PUTTY: à -> "Ã "
           letter 0xffffffc3 is number -61
           letter 0xffffffa0 is number -96
(note: two bytes in this case)
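
One possible reading of these results, offered as a sketch rather than a diagnosis: the three byte patterns match three encodings of the same letter. 0x85 is "à" in the DOS code pages CP437 and CP850, 0xe0 is "à" in ISO-8859-1, and 0xc3 0xa0 is the UTF-8 encoding of U+00E0. The leading ffffff looks like sign extension from printing a signed char with %x. The helper below (format_byte is a hypothetical name, not part of SmaugFUSS) prints a byte without that artifact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write one input byte as "0xNN" into out.
 * The cast to unsigned char keeps bytes >= 0x80 from being
 * sign-extended to 0xffffffNN when char is signed. */
static int format_byte(char c, char *out, size_t outsz)
{
    assert(out != NULL);
    return snprintf(out, outsz, "0x%02x", (unsigned char)c);
}
```

Calling format_byte on each byte of d->inbuf would log the TELNET case as 0x85 and the PUTTY case as 0xc3 followed by 0xa0.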


Is there a way to detect the (undeclared) charset of user input?
       
#2 Dec 8, 2014, 12:41 am

Quixadhal
Conjurer
Group: Members
Posts: 398
Joined: Mar 8, 2005

No, there is not. Welcome to the joys of language.

The best you can hope for is to implement a proper TELNET stack and send the appropriate sequence to ask the client what character set encoding it is using, and hope it gives a valid answer. Otherwise, you have to punt and just take your best guess.
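
For illustration, the sequence in question is most likely the TELNET CHARSET option (RFC 2066, option code 42). The sketch below is a minimal, assumed implementation and not SmaugFUSS code: the TN_ constants and the names build_charset_request and parse_charset_accepted are hypothetical. It assumes the option has already been agreed on with WILL/DO CHARSET, that charset names are plain ASCII, and that the reply arrives as one complete subnegotiation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* TELNET command bytes and RFC 2066 CHARSET codes. */
enum { TN_IAC = 255, TN_SB = 250, TN_SE = 240 };
enum { TN_CHARSET = 42, CS_REQUEST = 1, CS_ACCEPTED = 2, CS_REJECTED = 3 };

/* Build "IAC SB CHARSET REQUEST ;name;name... IAC SE" using ';' as the
 * separator. Returns the number of bytes written, or 0 if out is too small. */
static size_t build_charset_request(unsigned char *out, size_t outsz,
                                    const char *const *names, size_t count)
{
    size_t need = 6; /* 4 header bytes + IAC SE */
    size_t n = 0, i;

    assert(out != NULL && names != NULL);
    for (i = 0; i < count; i++)
        need += 1 + strlen(names[i]);
    if (need > outsz)
        return 0;

    out[n++] = TN_IAC; out[n++] = TN_SB;
    out[n++] = TN_CHARSET; out[n++] = CS_REQUEST;
    for (i = 0; i < count; i++) {
        size_t len = strlen(names[i]);
        out[n++] = ';';
        memcpy(out + n, names[i], len);
        n += len;
    }
    out[n++] = TN_IAC; out[n++] = TN_SE;
    return n;
}

/* Parse "IAC SB CHARSET ACCEPTED name IAC SE". On success copies the
 * charset name into name and returns 1; otherwise returns 0. */
static int parse_charset_accepted(const unsigned char *msg, size_t len,
                                  char *name, size_t namesz)
{
    size_t nlen;

    if (len < 6 || msg[0] != TN_IAC || msg[1] != TN_SB ||
        msg[2] != TN_CHARSET || msg[3] != CS_ACCEPTED)
        return 0;
    if (msg[len - 2] != TN_IAC || msg[len - 1] != TN_SE)
        return 0;
    nlen = len - 6;
    if (nlen + 1 > namesz)
        return 0;
    memcpy(name, msg + 4, nlen);
    name[nlen] = '\0';
    return 1;
}
```

A client that does not implement the option will refuse it or stay silent, so the caller still needs a fallback guess.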

Unicode is more and more likely to be used these days, but there are also several latin codepages which are quite common.
       
#3 Dec 8, 2014, 7:14 am

Matteo2303
Apprentice
Group: Members
Posts: 57
Joined: Aug 25, 2003

I solved it by mapping user input at a low level.
Something like a prompt: "press the àèìòù sequence pls".
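
A calibration prompt like this could be checked roughly as follows. This is only a guess at the approach, not the poster's actual code: the input_encoding type and detect_encoding function are hypothetical names, and the sketch assumes the reply has already had its line ending stripped and contains exactly the five letters àèìòù in that order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { ENC_UNKNOWN, ENC_LATIN1, ENC_CP437, ENC_UTF8 } input_encoding;

/* Compare the raw bytes of the player's reply against the byte
 * sequences that "àèìòù" produces in three common encodings. */
static input_encoding detect_encoding(const char *input, size_t len)
{
    /* UTF-8: each letter is two bytes, 0xC3 followed by a trail byte. */
    static const unsigned char utf8[] =
        { 0xC3, 0xA0, 0xC3, 0xA8, 0xC3, 0xAC, 0xC3, 0xB2, 0xC3, 0xB9 };
    /* ISO-8859-1: one byte per letter. */
    static const unsigned char latin1[] = { 0xE0, 0xE8, 0xEC, 0xF2, 0xF9 };
    /* CP437 (same positions in CP850): one byte per letter. */
    static const unsigned char cp437[] = { 0x85, 0x8A, 0x8D, 0x95, 0x97 };

    assert(input != NULL);
    if (len == sizeof utf8 && memcmp(input, utf8, len) == 0)
        return ENC_UTF8;
    if (len == sizeof latin1 && memcmp(input, latin1, len) == 0)
        return ENC_LATIN1;
    if (len == sizeof cp437 && memcmp(input, cp437, len) == 0)
        return ENC_CP437;
    return ENC_UNKNOWN;
}
```

The result could be stored per connection and used to pick the translation table for later input. Anything that comes back as ENC_UNKNOWN would still need a default.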

I had tried the TELNET escape sequence for charset, but only one client in 50 responded properly.

Thank you!
mat
       