node-iconv-jp

Text recoding in JavaScript for fun and profit!

Installing with npm

npm install iconv-jp

Note that the npm-ified version of node-iconv-jp only works with node.js >= v0.3.0.

Cloning the repository

If you are developing against node.js v0.3.0 or later:

git clone git@github.com:kazuhisya/node-iconv-jp.git

If you are developing against node.js v0.2.x:

git clone -b v0.2.x git://github.com/bnoordhuis/node-iconv.git

v0.2.x support is slowly being phased out but it will receive bug fixes for the foreseeable future.

Compiling

To compile and install the module, type:

make install NODE_PATH=/path/to/nodejs

NODE_PATH will default to /usr/local if omitted.

Note that you do not need to have a copy of libiconv installed to use this module.

Usage

Encode from one character encoding to another:

// convert from UTF-8 to ISO-8859-1
var Buffer = require('buffer').Buffer;
var Iconv  = require('iconv-jp').Iconv;
var assert = require('assert');

var iconv = new Iconv('UTF-8', 'ISO-8859-1');
// convert() accepts either a string or a Buffer and returns a Buffer
var buffer = iconv.convert('Hello, world!');
var buffer2 = iconv.convert(new Buffer('Hello, world!'));
assert.equal(buffer.inspect(), buffer2.inspect());
// do something useful with the buffers

Look at test.js for more examples and node-iconv-jp's behaviour under error conditions.

Notes

Things to keep in mind when you work with node-iconv-jp.

Chunked data

Say you are reading data in chunks from an HTTP stream. The logical input is a single document (the full POST request data), but the physical input is spread over several buffers (the request chunks).

You must accumulate the small buffers into a single large buffer before performing the conversion. If you don't, you will get unexpected results with multi-byte and stateful character sets like UTF-8 and ISO-2022-JP.

node-buffertools lets you concatenate buffers painlessly. See the description of buffertools.concat() for details.
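The effect of converting chunk by chunk can be seen without the module at all. The sketch below is only an illustration under a few assumptions: it uses `Buffer.from`, `Buffer.concat`, and `subarray` from a recent Node.js core (newer than the versions this README targets) instead of buffertools, and `toString('utf8')` stands in for the real conversion step.

```javascript
// "が" is three bytes in UTF-8: e3 81 8c.
var whole = Buffer.from('ça va が', 'utf8');

// Simulate two request chunks that split "が" after its first byte.
var cut = whole.length - 2;
var chunks = [whole.subarray(0, cut), whole.subarray(cut)];

// Wrong: decoding each chunk on its own mangles the split character.
var piecewise = chunks.map(function (chunk) {
  return chunk.toString('utf8');
}).join('');

// Right: accumulate the chunks first, then decode once.
var joined = Buffer.concat(chunks).toString('utf8');

console.log(piecewise === 'ça va が'); // false
console.log(joined === 'ça va が');    // true
```

With node-iconv-jp the same rule applies: collect every chunk, join them, and call `convert()` once on the result.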

Dealing with untranslatable characters

Characters are not always translatable to another encoding. The UTF-8 string "ça va が", for example, cannot be represented in plain 7-bit ASCII without some loss of fidelity.

By default, node-iconv-jp throws EILSEQ when untranslatable characters are encountered, but this can be customized. Quoting the iconv_open(3) man page:

//TRANSLIT
	When the string "//TRANSLIT" is appended to tocode, transliteration is activated.
	This means that when a character cannot be represented in the target character set,
	it can be approximated through one or several similarly looking characters.

//IGNORE
	When the string "//IGNORE" is appended to tocode, characters that cannot be represented
	in the target character set will be silently discarded.

Example usage:

var iconv = new Iconv('UTF-8', 'ASCII');
iconv.convert('ça va'); // throws EILSEQ

var iconv = new Iconv('UTF-8', 'ASCII//IGNORE');
iconv.convert('ça va'); // returns "a va"

var iconv = new Iconv('UTF-8', 'ASCII//TRANSLIT');
iconv.convert('ça va'); // returns "ca va"

var iconv = new Iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE');
iconv.convert('ça va が'); // returns "ca va "
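One way to combine these modes is to try a strict conversion first and fall back to a lossy one only when it fails. The helper below is a hypothetical sketch, not part of node-iconv-jp: `convertWithFallback` and its return shape are invented for illustration, and because this README does not say how the thrown error exposes EILSEQ, the sketch falls back on any error.

```javascript
// Hypothetical helper: strict conversion first, lossy conversion as a fallback.
// IconvCtor is expected to behave like the Iconv constructor shown above.
function convertWithFallback(IconvCtor, from, to, input) {
  try {
    // Strict mode: throws if a character cannot be represented in `to`.
    return { lossy: false, data: new IconvCtor(from, to).convert(input) };
  } catch (err) {
    // Assumption: any error is treated as "untranslatable input".
    var lenient = new IconvCtor(from, to + '//TRANSLIT//IGNORE');
    return { lossy: true, data: lenient.convert(input) };
  }
}
```

It could be called as `convertWithFallback(require('iconv-jp').Iconv, 'UTF-8', 'ASCII', 'ça va が')`. The `lossy` flag lets the caller log or reject input that lost information.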

EINVAL

EINVAL is raised when the input ends in a partial character sequence. This is a feature, not a bug.
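To make "partial character sequence" concrete for UTF-8 input, the sketch below checks whether a buffer stops in the middle of a multi-byte character. `endsWithPartialUtf8` is a hypothetical helper written for this illustration. It is not how node-iconv-jp detects the condition, it does not validate the rest of the buffer, and it assumes a Node.js version that provides `Buffer.from`.

```javascript
// Hypothetical helper: does this buffer end in an incomplete UTF-8 sequence?
function endsWithPartialUtf8(buf) {
  // Walk back over at most three continuation bytes (10xxxxxx).
  var i = buf.length - 1;
  var seen = 0;
  while (i >= 0 && seen < 3 && (buf[i] & 0xc0) === 0x80) {
    i--;
    seen++;
  }
  if (i < 0) {
    return false; // empty, or continuation bytes only: invalid rather than partial
  }
  // The lead byte says how long the whole sequence should be.
  var lead = buf[i];
  var need = lead >= 0xf0 ? 4 : lead >= 0xe0 ? 3 : lead >= 0xc0 ? 2 : 1;
  return buf.length - i < need;
}

console.log(endsWithPartialUtf8(Buffer.from([0x61, 0xe3]))); // true: "a" plus the first byte of "が"
console.log(endsWithPartialUtf8(Buffer.from('が', 'utf8'))); // false: all three bytes are present
```

Input like the first buffer is what triggers EINVAL. The usual remedy is to wait for the remaining bytes and convert the accumulated buffer.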
