
parsingtutorial's Introduction

Your first Java 8/Maven tutorial to become a big data engineer

Who is this tutorial for?

This project was the first tutorial in my journey to become a big data engineer. I learnt a lot doing it, so I thought: why not share it with others who also want to become big data developers?

To follow this tutorial, you should know the basics of programming (basic types, loops, etc.) and of object-oriented programming in Java (classes, methods). If not, I recommend checking out these resources:

This tutorial will not show you how to write every single line of the project. Instead, it leads you to write your own version of the project through a structured list of instructions and questions. The best way to learn is by doing :)

If you find an error or have a suggestion, don't hesitate to send me a message or open a pull request :)

What you will learn

  • What Maven is
  • How to discover and use IntelliJ
  • How to parse a JSON file and an XML file
  • How to compute some statistics on the data parsed from those files
  • How to use the Stream API, and how it lets you avoid big, ugly loops
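To give a first taste of the last point, here is a minimal sketch comparing a classic loop with a Stream pipeline for the same computation. The class name `StreamVsLoop` and the sample ages are made up for illustration; they are not part of the project.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class, for illustration only.
class StreamVsLoop {

    // Classic loop: sum the ages by hand, then divide by the count.
    static double averageWithLoop(List<Integer> ages) {
        if (ages.isEmpty()) {
            return 0.0;
        }
        int sum = 0;
        for (int age : ages) {
            sum += age;
        }
        return (double) sum / ages.size();
    }

    // Stream version: the same computation expressed as a pipeline.
    // average() returns an OptionalDouble, empty when the list is empty.
    static double averageWithStream(List<Integer> ages) {
        return ages.stream()
                .mapToInt(Integer::intValue)
                .average()
                .orElse(0.0);
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(31, 25, 40); // made-up sample values
        System.out.println(averageWithLoop(ages));   // prints 32.0
        System.out.println(averageWithStream(ages)); // prints 32.0
    }
}
```

Both methods return the same value; the Stream version simply states what is computed rather than how to iterate.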

Setting up

Install Java

  • Which version did you choose? Why?
  • What is the difference between the JDK and the JRE?

http://www.oracle.com/technetwork/java/javase/downloads/index.html

Some other Java tutorials:

Install Maven

Some Maven tutorials:

Install an IDE: IntelliJ

  • What is an IDE?
  • How do you configure a proxy in IntelliJ?
  • How do you install plugins? Install the Scala plugin (for the next tutorial)
  • https://www.jetbrains.com/idea/

You can also use Eclipse, but IntelliJ is the most advanced Java IDE in my honest opinion.

Install a version control system: Git

Part 1

Take the JSON file: liste_noms_age.json

The purpose of the program is to read the JSON file and display the answers to the following questions:

  • Who is the youngest? Who is the oldest?
  • What is the longest name? The shortest?
  • What is the average age?
  • What is the greatest age difference between two successive people?
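Try to answer these questions on your own first. If you get stuck, the sketch below shows one possible way to express them with the Stream API once the data is in memory. It rests on assumptions: `Person`, `PeopleStats`, and the sample values are hypothetical, the real JSON fields may be named differently, and "successive" is interpreted here as "adjacent in file order".

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.OptionalDouble;
import java.util.OptionalInt;
import java.util.stream.IntStream;

// Hypothetical data holder; adapt the fields to the actual JSON content.
class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Hypothetical helper class; every method returns an empty Optional
// when there is not enough data to answer the question.
class PeopleStats {

    static Optional<Person> youngest(List<Person> people) {
        return people.stream().min(Comparator.comparingInt((Person p) -> p.age));
    }

    static Optional<Person> oldest(List<Person> people) {
        return people.stream().max(Comparator.comparingInt((Person p) -> p.age));
    }

    static Optional<String> longestName(List<Person> people) {
        return people.stream().map(p -> p.name).max(Comparator.comparingInt(String::length));
    }

    static Optional<String> shortestName(List<Person> people) {
        return people.stream().map(p -> p.name).min(Comparator.comparingInt(String::length));
    }

    static OptionalDouble averageAge(List<Person> people) {
        return people.stream().mapToInt(p -> p.age).average();
    }

    // Assumption: "successive" means neighbours in the list order.
    static OptionalInt greatestSuccessiveGap(List<Person> people) {
        return IntStream.range(1, people.size())
                .map(i -> Math.abs(people.get(i).age - people.get(i - 1).age))
                .max();
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList( // made-up sample data
                new Person("Alice", 30),
                new Person("Bob", 25),
                new Person("Christopher", 41),
                new Person("Dorothy", 38));
        System.out.println("Youngest: " + youngest(people).get().name);
        System.out.println("Oldest: " + oldest(people).get().name);
        System.out.println("Average age: " + averageAge(people).getAsDouble());
        System.out.println("Greatest gap: " + greatestSuccessiveGap(people).getAsInt());
    }
}
```

Returning `Optional` values instead of `null` or `0` makes the empty-file case explicit for the caller.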

Tips:

  • Write a Java class with a method that takes the path to a file as input and returns the content of the file as a String (reusable in Part 2)
  • Write a Java class with a method that takes a String (which actually contains JSON) as input and parses it into "readable and exploitable" objects
  • Write a class with a method that performs the processing (the statistics above)
  • Preferably use JsonPath (you may use another easy-to-use parser instead)
  • Learn more about the JsonPath API; sample documentation: https://github.com/json-path/JsonPath
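For the first tip, here is a minimal sketch of a file-reading helper that works on Java 8, using only the standard library. The class name `FileContentReader` is hypothetical, and reading the file as UTF-8 is an assumption about how the resource files are encoded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class, for illustration only.
class FileContentReader {

    // Reads the whole file into memory and decodes it as UTF-8 (assumed encoding).
    // Fine for small tutorial files; not meant for very large inputs.
    static String read(String path) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(path));
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Because the method knows nothing about JSON, the same helper can serve the XML file in Part 2.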

Part 2

Do the same with the XML file: liste_nom_ages.xml (from https://github.com/knoel99/ParsingTutorial/tree/master/src/main/resources)

Heads up: the ages have changed, otherwise it wouldn't be fun :D
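If you want a starting point, here is a rough sketch of XML parsing with the DOM parser that ships with the JDK. The element names `person`, `name`, and `age` are assumptions made for illustration, as are the classes `XmlPerson` and `XmlPeopleParser`; check the actual structure of the XML file and adapt accordingly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical data holder for one parsed entry.
class XmlPerson {
    final String name;
    final int age;

    XmlPerson(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Hypothetical parser class; assumes a layout such as
// <people><person><name>..</name><age>..</age></person>...</people>
class XmlPeopleParser {

    static List<XmlPerson> parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Collect every <person> element, in document order.
        NodeList nodes = doc.getElementsByTagName("person");
        List<XmlPerson> people = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element person = (Element) nodes.item(i);
            String name = person.getElementsByTagName("name").item(0).getTextContent().trim();
            String age = person.getElementsByTagName("age").item(0).getTextContent().trim();
            people.add(new XmlPerson(name, Integer.parseInt(age)));
        }
        return people;
    }
}
```

Once the XML is turned into a list of objects, the statistics code from Part 1 should be reusable with little or no change.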

Tips:

Part 3

Now, you have to:

  • Add some external configuration: the values that were hard-coded in the Java code must move to an external file (usually named application.properties or application.conf, placed in src/main/resources) and be read by the program at runtime
  • Add loggers (they are nicer than println); to do so, read up on SLF4J and Log4j
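For the external configuration, one possible approach is `java.util.Properties` from the standard library, sketched below. The class `AppConfig` and the key `input.json.path` are hypothetical names chosen for illustration; pick whatever keys fit your own program. This sketch covers the .properties format only, not application.conf.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.util.Properties;

// Hypothetical configuration helper, for illustration only.
class AppConfig {

    // Loads a properties file from the classpath; with Maven, files placed in
    // src/main/resources end up on the classpath at runtime.
    static Properties loadFromClasspath(String resourceName) throws IOException {
        try (InputStream in = AppConfig.class.getClassLoader().getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException("Resource not found on classpath: " + resourceName);
            }
            Properties props = new Properties();
            props.load(in);
            return props;
        }
    }

    // Same parsing from any Reader, which is handy for unit tests.
    static Properties load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return props;
    }
}
```

A call such as `AppConfig.loadFromClasspath("application.properties").getProperty("input.json.path")` would then replace the hard-coded file path, assuming that key is defined in the file.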
