rbStata

rbStata is a CLI utility to easily convert between (or roll back) versions of Stata's .dta files, which are not forward compatible.

  • Cross-platform CLI utility: Windows, Mac, Linux.
  • No knowledge of Python required (but requires a Python installation).
  • Handles Unicode to ASCII transliteration (older versions of Stata do not support Unicode).
  • Works with Python 3.6+.
  • Handles transferring (where available and possible) of:
    • Variable labels
    • Data labels
    • Value labels

Statement of Need

Stata .dta data files are not forward compatible. This means you cannot use older versions (e.g., Stata 13) to read a .dta file exported from newer versions (e.g., Stata 17).

So what do you do when you try to open a .dta file in Stata and get a rude "dta too modern" r(601) error:

...

Roll back to older versions of Stata .dta files with rbStata. rbStata is a quick and dead simple CLI (command-line interface) to go way back with Stata data (.dta) files. You do not need access to newer Stata versions.

Quick usage

  • Simple single-line command-line usage:

    • Convert the auto.dta file so that you can open it in Stata 13

      $ rbstata auto.dta --target-version 13 --verbose
    • Convert all dta files in the path so that you can open them in Stata 13

      $ rbstata --all --target-version 13 --verbose
  • Let rbStata prompt you for relevant settings:

    Type rbstata and enter settings (press enter to accept default settings in brackets):

    $ rbstata
    -------------------------------------------------------------
    Welcome to the rbStata quickstart command-line utility.
    
    You will be prompted for relevant settings.
    
    Please enter values under the following settings.
    (just press Enter to accept the default value in brackets)
    -------------------------------------------------------------
    
    Enter the dta file(s) you want to convert (e.g. ''auto.dta'').
    (It is not necessary to key in the .dta extension (e.g., just type ''auto'').
    Press Enter to include all .dta files in the current directory.)
    > .dta file(s) [*]:
    ...
    
    The Stata version to convert to.
    > Target version [13]:
    ...
    
    File suffix for saving the output file(s).
    (For example, the suffix ''-old'' means that auto.dta will be converted and
    saved as auto-old.dta. Default is to use ''-rbstata''.)
    > File suffix for saving [-rbstata]:
    ...
    
    Include all .dta files in current directory and its subdirectories.
    (Default is to include only the .dta files in the current directory.)
    > Include subdirectories (y/n) [n]:
    ...
    
    > Print all messages (y/n) [y]:
    ...
Settings [defaults]
  • .dta file(s) [*]: .dta files to convert [all .dta files in current directory]
  • Target version [13]: version to convert to [Stata v13]
  • File suffix for saving [-rbstata]: Suffix for saving [E.g. save auto.dta to auto-rbstata.dta]
  • Include subdirectories (y/n) [n]: Include subdirectories when converting all files (*) [no]
  • Print all messages (y/n) [y]: Print all messages and errors [yes]
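
The file-selection and suffix settings above can be illustrated with a short sketch. This is not rbStata's own code: the helper names `normalize_name`, `find_dta_files`, and `output_path` are hypothetical, and the sketch only assumes the behavior described in the prompts (the .dta extension is optional, the suffix goes before the extension, and subdirectories are searched only on request).

```python
from pathlib import Path


def normalize_name(name):
    """Accept 'auto' as shorthand for 'auto.dta', as the prompt allows."""
    return name if name.lower().endswith(".dta") else f"{name}.dta"


def find_dta_files(directory=".", recursive=False):
    """List .dta files in `directory`, optionally including subdirectories."""
    pattern = "**/*.dta" if recursive else "*.dta"
    return sorted(Path(directory).glob(pattern))


def output_path(dta_file, suffix="-rbstata"):
    """Map auto.dta to auto-rbstata.dta in the same directory."""
    dta_file = Path(dta_file)
    return dta_file.with_name(f"{dta_file.stem}{suffix}{dta_file.suffix}")


if __name__ == "__main__":
    # Show where each file in the current directory would be saved.
    for path in find_dta_files("."):
        print(path, "->", output_path(path))
```

Writing to a suffixed copy by default leaves the original file untouched, which is presumably why overwriting is a separate, explicit option.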

More about the problem

Assortment of enquiries about the error
  • [1] Stata support FAQs: How can I save a Stata dataset so that it can be read by a previous version of Stata?
  • [2] how to read a stata 15 data file in stata 13.
  • [3] How to open stata 14 files in Stata 12-13.
  • [4] How to open a new stata dataset version.
  • [5] How to open a file that is from a more recent version of Stata into Stata13.

Versions and string handling

One major break in compatibility is between Stata 13 and Stata 14, when Stata 14 introduced Unicode support. rbStata transfers the data, value, and variable labels. If the labels contain Unicode and the target version is 13, rbStata transliterates the Unicode to ASCII and truncates labels to 80 characters.
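
As a rough illustration of the transliterate-and-truncate step, the sketch below uses only Python's standard library. It is an assumption about how such a step could look, not rbStata's implementation: the names `to_ascii` and `clean_label` are hypothetical, and rbStata may use a different transliteration method that maps more characters.

```python
import unicodedata

MAX_LABEL_LENGTH = 80  # label limit mentioned above


def to_ascii(text):
    """Approximate Unicode text in ASCII by stripping diacritics.

    Characters without an ASCII decomposition (e.g. the euro sign)
    are dropped rather than replaced.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def clean_label(label, max_length=MAX_LABEL_LENGTH):
    """Transliterate a label to ASCII, then cut it to `max_length` characters."""
    return to_ascii(label)[:max_length]


if __name__ == "__main__":
    print(clean_label("Café naïve"))  # Cafe naive
```

Truncating after transliteration matters here: the ASCII form can be shorter than the original when characters are dropped, so cutting first could discard more text than necessary.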

Alternative solutions

Based on the solutions proposed in the threads listed under More about the problem.

  • [0] Use Stata's saveold. This requires access to the newer Stata version first: read in the dta file, save it using saveold, then open the converted dta file in your older Stata version.
  • [1] Stat/Transfer (proprietary).
  • [2] R's haven package.

About this utility

rbStata is an open source utility that wraps around click and pandas's DataFrame.to_stata method. Use rbStata to easily convert new Stata dta files to older versions.
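
For readers curious what a conversion built on pandas might involve, here is a minimal sketch. It is not rbStata's source: `dta_format_for` and `roll_back` are hypothetical names, and the mapping from Stata release to .dta format follows the pandas documentation for `to_stata` (format 114 is readable by Stata 10 and later, 117 by Stata 13 and later, 118 by Stata 14 and later).

```python
def dta_format_for(stata_version):
    """Pick a .dta format that pandas can write and `stata_version` can read."""
    if stata_version >= 14:
        return 118
    if stata_version == 13:
        return 117
    if stata_version >= 10:
        return 114
    raise ValueError(f"No supported .dta format for Stata {stata_version}")


def roll_back(src, dst, stata_version=13):
    """Re-save `src` as `dst` in a format the target Stata version can open."""
    # Imported here so that dta_format_for can be used without pandas installed.
    import pandas as pd

    with pd.read_stata(src, iterator=True) as reader:
        df = reader.read()  # value labels arrive as categorical columns
        data_label = reader.data_label
        variable_labels = reader.variable_labels()

    df.to_stata(
        dst,
        write_index=False,
        version=dta_format_for(stata_version),
        data_label=data_label,
        variable_labels=variable_labels,
    )


if __name__ == "__main__":
    # Assumes an auto.dta file exists in the current directory.
    roll_back("auto.dta", "auto-rbstata.dta", stata_version=13)
```

Note that this sketch passes labels through unchanged, so for targets older than Stata 14 pandas may raise a ValueError on labels that are too long or contain characters the older format cannot store.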

Expose CLI help reference
$ rbstata -h
Usage: rbstata [OPTIONS] <dta files>

  Find your way back to older versions of dta files.

  Convert newer Stata .dta files to older versions so that you can open them in older
  Stata versions.

Options:
  -a, --all            Convert all dta files in path.
  -v, --version <int>  Which version of Stata to convert to.
  -s, --suffix <text>  Suffix to be added to converted file.
  -o, --output <text>  Name of converted .dta file (Single file conversion only).
                       Supercedes [suffix].
  -r, --recursive      Convert all .dta files in directory and subdirectories.
  -w, --overwrite      Over[w]rite original input .dta files.
  -ve, --verbose       Print messages.
  -h, --help           Show this message and exit.

rbstata's People

Contributors

lsys

rbstata's Issues

API design - change order of inputs

.dta file(s) [*]: .dta files to convert [all .dta files in current directory]
Include subdirectories (y/n) [n]: Include subdirectories if * [no]
Target version [13]: version to convert to [Stata v13]
File suffix for saving [-v13]: Suffix for saving [E.g. save auto.dta to auto-v13.dta]
Print messages (y/n) [y]: Print all messages and errors [yes]


The target version option seems more important than the include subdirectories option, yet the former comes after the latter.

Cannot convert nickchk causaldata

Cannot use rbStata on nickchk's causal data

Got the following error:

raise ValueError("Variable labels must be 80 characters or fewer")
ValueError: Variable labels must be 80 characters or fewer

Can try using a notebook to investigate
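
One possible starting point for that investigation is to list which variable labels exceed the limit named in the error. The helper below is a hypothetical sketch, not part of rbStata; it takes a plain dict of variable names to labels, such as the one returned by the `variable_labels()` method of a pandas `StataReader`.

```python
def overlong_labels(variable_labels, max_length=80):
    """Return {variable: label} for labels longer than `max_length` characters."""
    return {
        name: label
        for name, label in variable_labels.items()
        if len(label) > max_length
    }


if __name__ == "__main__":
    # Made-up labels for illustration only.
    labels = {"price": "Price", "notes": "n" * 95}
    for name, label in overlong_labels(labels).items():
        print(f"{name}: {len(label)} characters")
```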

Add to features in readme

  • Takes care of Unicode compatibility issues
  • Takes care of transferring variable labels, value labels, and data labels, if any

Refactor messages

Put boilerplate messages into separate variables/file?

Can be shared by both main and test file
