Hieroglyph

Tools for working with characters in Scala

Hieroglyph provides facilities for working with characters: in particular, typesafe support for different character encodings, and access to additional Unicode metadata.

Features

  • specification of character encodings
  • warnings issued if using an encoding which may not be available
  • provides additional character metadata from the Unicode database
  • facilitates accurate text width calculations, particularly for East Asian scripts

Availability Plan

Hieroglyph has not yet been published. The medium-term plan is to build Hieroglyph with Fury and to publish it as a source build on Vent. This will enable ordinary users to write and build software which depends on Hieroglyph.

Subsequently, Hieroglyph will also be made available as a binary in the Maven Central repository. This will enable users of other build tools to use it.

For the overeager, curious and impatient, see building.

Getting Started

Character Display Width

Hieroglyph provides an extension method, displayWidth, on Char which will return the amount of space the glyph for that character will take up, when rendered in a mono-spaced font.

Unsurprisingly, this will usually be 1, but many characters in scripts that are not based on the Latin alphabet, East Asian scripts in particular, need two normal character widths of space when rendered.

For example, compare:

'x'.displayWidth   // returns 1
'好'.displayWidth  // returns 2

However, calculating the width of a character (and, in particular, of a string of characters) is much slower if every character must be checked individually and totalled. For strings which are known not to contain any "wide" characters, the length field of the string provides the same value in constant time.

Gossamer provides a corresponding displayWidth extension method on all text-like types, which calculates the display width of the entire string by summing its character widths, or, with textMetrics.uniform in scope, simply returns the length value.

Therefore, methods which need to perform text width calculations can use either a uniform mode or an eastAsianScripts mode, depending on the contextual value imported from the textMetrics package.
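
As a rough illustration of how a contextual value can switch between the two modes, here is a minimal, self-contained sketch in plain Scala 3. It does not use Hieroglyph or Gossamer: the names TextMetrics, Uniform, EastAsian, isWide and padRight are hypothetical, and the wide-character check is a crude approximation based on java.lang.Character.UnicodeScript rather than the Unicode metadata that Hieroglyph draws on.

```scala
import java.lang.Character.UnicodeScript

// Hypothetical stand-in for a contextual text-metrics value; not Hieroglyph's API.
trait TextMetrics:
  def width(text: String): Int

object TextMetrics:
  // Crude approximation (an assumption of this sketch): treat these scripts as wide.
  private val wideScripts = Set(
    UnicodeScript.HAN, UnicodeScript.HIRAGANA,
    UnicodeScript.KATAKANA, UnicodeScript.HANGUL
  )

  def isWide(char: Char): Boolean =
    wideScripts.contains(UnicodeScript.of(char.toInt))

  // Constant-time: assumes the text contains no wide characters.
  object Uniform extends TextMetrics:
    def width(text: String): Int = text.length

  // Linear-time: checks every character and sums the widths.
  object EastAsian extends TextMetrics:
    def width(text: String): Int =
      text.foldLeft(0)((total, char) => total + (if isWide(char) then 2 else 1))

// A method that needs a width takes the metrics from the surrounding context.
def padRight(text: String, columns: Int)(using metrics: TextMetrics): String =
  text + " " * (columns - metrics.width(text)).max(0)
```

With TextMetrics.EastAsian as the contextual value, padRight("好", 4) appends two spaces; with TextMetrics.Uniform it appends three, because the string is assumed to occupy only one column.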

Status

Hieroglyph is classified as fledgling. For reference, Soundness projects are categorized into one of the following five stability levels:

  • embryonic: for experimental or demonstrative purposes only, without any guarantees of longevity
  • fledgling: of proven utility, seeking contributions, but liable to significant redesigns
  • maturescent: major design decisions broadly settled, seeking probatory adoption and refinement
  • dependable: production-ready, subject to controlled ongoing maintenance and enhancement; tagged as version 1.0.0 or later
  • adamantine: proven, reliable and production-ready, with no further breaking changes ever anticipated

Projects at any stability level, even embryonic projects, can still be used, as long as caution is taken to avoid a mismatch between the project's stability level and the required stability and maintainability of your own project.

Hieroglyph is designed to be small. Its entire source code currently consists of 319 lines of code.

Building

Hieroglyph will ultimately be built by Fury, when it is published. In the meantime, two possibilities are offered; both are acknowledged to be fragile, inadequately tested, and unsuitable for anything more than experimentation. They are provided only so that there is some answer to the question, "how can I try Hieroglyph?".

  1. Copy the sources into your own project

    Read the fury file in the repository root to understand Hieroglyph's build structure, dependencies and source location; the file format should be short and quite intuitive. Copy the sources into a source directory in your own project, then repeat (recursively) for each of the dependencies.

    The sources are compiled against the latest nightly release of Scala 3. There should be no problem compiling the project together with all of its dependencies in a single compilation.

  2. Build with Wrath

    Wrath is a bootstrapping script for building Hieroglyph and other projects in the absence of a fully-featured build tool. It is designed to read the fury file in the project directory, and produce a collection of JAR files which can be added to a classpath, by compiling the project and all of its dependencies, including the Scala compiler itself.

    Download the latest version of wrath, make it executable, and add it to your path, for example by copying it to /usr/local/bin/.

    Clone this repository inside an empty directory, so that the build can safely make clones of repositories it depends on as peers of hieroglyph. Run wrath -F in the repository root. This will download and compile the latest version of Scala, as well as all of Hieroglyph's dependencies.

    If the build was successful, the compiled JAR files can be found in the .wrath/dist directory.

Contributing

Contributors to Hieroglyph are welcome and encouraged. New contributors may like to look for issues marked beginner.

We suggest that all contributors read the Contributing Guide to make the process of contributing to Hieroglyph easier.

Please do not contact project maintainers privately with questions unless there is a good reason to keep them private. While it can be tempting to respond to such questions, private answers cannot be shared with a wider audience, and it can result in duplication of effort.

Author

Hieroglyph was designed and developed by Jon Pretty, and commercial support and training on all aspects of Scala 3 is available from Propensive OÜ.

Name

Hieroglyphics are elaborate characters, whose meaning requires interpretation, while Hieroglyph is a library which provides encodings to translate between characters and their binary representations.

In general, Soundness project names are always chosen with some rationale, though it is usually frivolous. Each name is chosen more for its uniqueness and intrigue than its concision or catchiness, and there is no bias towards names with positive or "nice" meanings, since many of the libraries perform some quite unpleasant tasks.

Names should be English words, though many are obscure or archaic, and it should be noted how willingly English adopts foreign words. Names are generally of Greek or Latin origin, and have often arrived in English via a Romance language.

Logo

The logo is a stylised rendering of the Unicode logo.

License

Hieroglyph is copyright © 2024 Jon Pretty & Propensive OÜ, and is made available under the Apache 2.0 License.

hieroglyph's Issues

Check whether buffers could be allocated more efficiently

Calling decode allocates at least 4096 bytes and chars as working space for the decoding. These are not currently reused, except through the JVM, which only does so by GC. As long as the buffers are capture-checked, it ought to be possible to pool and reuse the same ones multiple times.
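
One possible shape for such pooling is sketched below in plain Scala 3. This is only an illustration under stated assumptions: BufferPool and withBuffers are hypothetical names, the 4096 size is taken from the description above, and nothing here enforces that the buffers do not escape the borrowing scope, which is the guarantee capture checking would be expected to supply.

```scala
import java.nio.{ByteBuffer, CharBuffer}
import java.util.concurrent.ConcurrentLinkedQueue

// Hypothetical pool of working buffers; not part of Hieroglyph.
object BufferPool:
  val size = 4096
  private val pool = ConcurrentLinkedQueue[(ByteBuffer, CharBuffer)]()

  // Lends a cleared buffer pair to `action`, then returns the pair to the pool.
  def withBuffers[T](action: (ByteBuffer, CharBuffer) => T): T =
    val polled = pool.poll() // null when the pool is empty
    val (bytes, chars) =
      if polled == null then (ByteBuffer.allocate(size), CharBuffer.allocate(size))
      else polled
    bytes.clear()
    chars.clear()
    try action(bytes, chars) finally pool.offer((bytes, chars))
```

A caller would wrap each decoding step in BufferPool.withBuffers, so that a second call made after the first has returned reuses the same pair instead of allocating a new one.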

Provide subscript/superscript partial functions

These are defined and redefined in a few Scala One projects, but they belong in Lithography. They should be provided as PartialFunctions so that isDefinedAt can be used to determine if a particular character is convertible.
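
A minimal sketch of what such partial functions might look like follows. The names superscript, subscript and toSuperscript are hypothetical, and the character coverage is limited to the digits and a few symbols that have dedicated Unicode superscript and subscript forms. It relies on the fact that a Scala Map is itself a PartialFunction whose isDefinedAt is equivalent to contains.

```scala
// Hypothetical names; a sketch of the PartialFunction shape described above.
val superscript: PartialFunction[Char, Char] = Map(
  '0' -> '⁰', '1' -> '¹', '2' -> '²', '3' -> '³', '4' -> '⁴',
  '5' -> '⁵', '6' -> '⁶', '7' -> '⁷', '8' -> '⁸', '9' -> '⁹',
  '+' -> '⁺', '-' -> '⁻', '=' -> '⁼', '(' -> '⁽', ')' -> '⁾'
)

val subscript: PartialFunction[Char, Char] = Map(
  '0' -> '₀', '1' -> '₁', '2' -> '₂', '3' -> '₃', '4' -> '₄',
  '5' -> '₅', '6' -> '₆', '7' -> '₇', '8' -> '₈', '9' -> '₉',
  '+' -> '₊', '-' -> '₋', '=' -> '₌', '(' -> '₍', ')' -> '₎'
)

// Converts a whole string only if every character is convertible.
def toSuperscript(text: String): Option[String] =
  if text.forall(superscript.isDefinedAt) then Some(text.map(superscript)) else None
```

Here superscript.isDefinedAt('2') is true and superscript.isDefinedAt('x') is false, so callers can test convertibility before applying the function, or use toSuperscript to convert all-or-nothing.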

Make sure decoding is finished

The decoder.decode method needs to be called with true before decoding is considered "finished". This is probably relevant for identifying incomplete byte sequences at the end of input as errors, but this hasn't been tested. Most likely, the error goes undetected and the final character is skipped.
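
The underlying JDK behaviour can be checked in isolation. The sketch below uses only java.nio.charset.CharsetDecoder with its default error actions, not Hieroglyph's own decode; decodeChunk and truncated are hypothetical names. Whether Hieroglyph's decoding is affected in the same way remains an assumption until it is tested.

```scala
import java.nio.{ByteBuffer, CharBuffer}
import java.nio.charset.{CoderResult, StandardCharsets}

// Decodes a single chunk with a fresh UTF-8 decoder and returns the decoder's
// result; `last` is passed through as the endOfInput argument.
def decodeChunk(bytes: Array[Byte], last: Boolean): CoderResult =
  val decoder = StandardCharsets.UTF_8.newDecoder()
  val in = ByteBuffer.wrap(bytes)
  val out = CharBuffer.allocate(16)
  decoder.decode(in, out, last)

// The first two bytes of the three-byte UTF-8 encoding of '好' (E5 A5 BD).
val truncated: Array[Byte] = Array(0xE5.toByte, 0xA5.toByte)
```

With last = false, the result is an underflow: the decoder waits for more input and the two trailing bytes stay unconsumed, so the incomplete sequence is not reported. With last = true, the result is a malformed-input error of length 2. The JDK documentation also asks for flush to be called after the final decode invocation.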
