
Thread: INI files content

 
  1. #1
    Contributing User
    Join Date
    May 2011
    Posts
    166
    Rep Power
    200

    INI files content

    Hello

    I don't know if anyone has noticed, but the INI files used by Trados have each ASCII character padded with a null byte:

    pabloa:~/Development$ hexdump -C file.ini
    00000000 ff fe 0d 00 0a 00 5b 00 47 00 65 00 6e 00 65 00 |......[.G.e.n.e.|
    00000010 72 00 61 00 6c 00 53 00 47 00 4d 00 4c 00 53 00 |r.a.l.S.G.M.L.S.|
    00000020 65 00 74 00 74 00 69 00 6e 00 67 00 73 00 5d 00 |e.t.t.i.n.g.s.].|
    00000030 0d 00 0a 00 43 00 61 00 73 00 65 00 53 00 65 00 |....C.a.s.e.S.e.|
    00000040 6e 00 73 00 69 00 74 00 69 00 76 00 65 00 3d 00 |n.s.i.t.i.v.e.=.|
    00000050 59 00 65 00 73 00 0d 00 0a 00 44 00 54 00 44 00 |Y.e.s.....D.T.D.|
    [etc ...]

    This makes the files very difficult to manipulate with standard Unix tools. For the time being, I've found a way to filter them so that I can use "grep" on substrings I might be interested in. First I created a very simple sed script:
    pabloa:~/Development$ cat -A cleanup_ini.sed
    s/^@//g

    Here ^@ is the null character. Then I can do things like:

    pabloa:~/Development$ sed -f cleanup_ini.sed file.ini | grep Group
    Tag1=type:External,Group
    Tag2=native_title_lang:External,Group
    Tag12=type_of_place:External,Group
    Tag50=ddc:External,Group

    With lots of good luck someone might find this useful.

    Cheers.
    P.
    Last edited by pabloa; 06-13-2011 at 01:07 PM.

  2. #2
    Contributing User
    Join Date
    May 2011
    Posts
    166
    Rep Power
    200

    Re: INI files content

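    I've found out why this is so. The files are encoded in UTF-16 (the leading "ff fe" in the hexdump is the little-endian byte order mark), where each ASCII character takes two bytes, the second of which is a null byte. The method suggested above works fine as long as there are no characters outside the ASCII range. For any other character the second byte is generally not null, so stripping null bytes leaves garbage behind and breaks the encoding. The correct way of dealing with this situation is to convert the file into UTF-8 with the following command: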

    Code:
    pabloa:~/Development$ iconv -f UTF-16 -t UTF-8 file.ini > file-utf8.ini
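
    For anyone who wants to see the difference between the two approaches at the byte level, here is a minimal Python sketch. It is only an illustration, not part of the Trados tooling: the function names and the sample content (a "Name" key with a Polish place name) are made up for the example. It mimics the sed filter by dropping null bytes, and mimics the iconv conversion by decoding UTF-16 and re-encoding as UTF-8.

```python
import codecs


def strip_nulls(raw: bytes) -> bytes:
    """Mimic the sed filter: drop every null byte from the raw file content."""
    return raw.replace(b"\x00", b"")


def utf16_to_utf8(raw: bytes) -> bytes:
    """Mimic the iconv conversion: decode UTF-16 (the byte order mark
    selects the byte order and is consumed) and re-encode as UTF-8."""
    return raw.decode("utf-16").encode("utf-8")


# Hypothetical sample: BOM + an INI fragment whose value is not pure ASCII.
# "\u0141\u00f3d\u017a" is the place name "Lodz" written with Polish letters.
text = "[General]\r\nName=\u0141\u00f3d\u017a\r\n"
sample = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

if __name__ == "__main__":
    # The null-stripped bytes keep the BOM and mangle the non-ASCII letters.
    print(strip_nulls(sample))
    # The converted bytes are valid UTF-8 and decode back to the original text.
    print(utf16_to_utf8(sample).decode("utf-8"))
```

    With the null-stripping approach, the section header survives but the value after "Name=" does not: the non-null second bytes of the Polish letters are left in the output as stray control characters, and the byte order mark stays at the start of the file.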
    Cheers.
    P.
