prenastro
Name: Preranathm
Type: User
Bio: Software Developer
Location: Los Angeles
Developed a simple web crawler for the C-SPAN website that measures aspects of the crawl, studies its characteristics, downloads the crawled web pages, and gathers page metadata.
Create word clouds in JavaScript.
CSCI 585 assignments: 1. EER diagram for E-Learn 2. SQL 3. KML (nearest neighbors and convex hull code) 4. TinkerPop Gremlin 5. Running the Weka, RapidMiner, and KNIME tools.
Uses the AlexNet CNN to classify images into one of the classes defined in caffe_classes.py. Images with similar classes can be grouped together and used for image similarity search. To test the model, run testModel.py.
Deep Learning based Sentiment Ranking for Multimedia
project related
Image similarity and search application
ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest tens of millions of files (images, but could be extended to other files) in place, and to extract metadata and OCR information from those files/images using Tika and Tesseract OCR.
Models, and associated helper code for GSOC 2017 project Tensorflow Image to Text in Apache Tika
Created an Inverted Index of words occurring in a set of web pages using a subset of 74 files from a total of 408 files (text extracted from HTML tags) derived from the Stanford WebBase project (https://ebiquity.umbc.edu/resource/html/id/351). Placed these files in a bucket on Google cloud storage and ran a Hadoop job to read inputs from this bucket.
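The indexing job described above can be sketched in plain Python as a map step followed by a reduce step. This is a minimal illustration, not the repository's Hadoop code: the function names, the toy documents, and the choice to store per-document word counts are all assumptions made for the example.

```python
import re
from collections import defaultdict


def map_doc(doc_id, text):
    """Map step: emit a (word, doc_id) pair for every word in one document."""
    for word in re.findall(r"[a-z]+", text.lower()):
        yield word, doc_id


def reduce_postings(pairs):
    """Reduce step: group pairs by word and count occurrences per document."""
    index = defaultdict(lambda: defaultdict(int))
    for word, doc_id in pairs:
        index[word][doc_id] += 1
    return {word: dict(counts) for word, counts in index.items()}


def build_inverted_index(docs):
    """Run the map step over every document, then reduce into one index."""
    pairs = (pair for doc_id, text in docs.items() for pair in map_doc(doc_id, text))
    return reduce_postings(pairs)


# Hypothetical toy corpus standing in for the extracted page text.
docs = {
    "doc1": "the quick brown fox",
    "doc2": "the lazy dog and the fox",
}
index = build_inverted_index(docs)
```

Looking up `index["fox"]` returns the documents that contain the word, which is the query an inverted index is built to answer.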
Docker container with node, npm, bower, yeoman and grunt packages.
An OsgViewer with support for the Oculus Rift
LDA Topic Modeling for Polar Data Insights
Conceptual, temporal, and spatial analysis of the TREC Polar dataset
Polar USC activities related to NSF Polar CyberInfrastructure program at the University of Southern California
This code connects to the Solr database created for Sparkler-crawled data to perform further data extraction, classification, filtering, and insight generation using various machine learning models. The models can take a keyword list from the user, extract features from URL content, classify (score) the output, and update the corresponding Solr parameter. Apache Sparkler link: https://github.com/USCDataScience/sparkler
Polling app for the Windows Phone OS, used for surveys. Users can post their own questions and vote for their favourite option on questions posted by others.
Added spell checking, autocomplete, and snippet functionality to the Solr search engine. Enhanced Solr with spelling correction and an autocomplete (suggest) function. Also used Norvig's external spelling correction program alongside Solr to enhance Solr's autocomplete functionality. Norvig's program uses a text file ("big.txt") to obtain the set of words against which edit distance is calculated. Here I am using Apache Tika for this purpose.
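The edit-distance idea behind Norvig's spelling corrector can be sketched as follows. This is a simplified version that only considers candidates one edit away, whereas Norvig's published program also tries two edits. The small `corpus` string is a hypothetical stand-in for "big.txt", and nothing here reflects how the repository wires the corrector into Solr.

```python
import re
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def words(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts):
    """Return word if known, else the most frequent known word one edit away."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    if not candidates:
        return word  # nothing better found; leave the input unchanged
    return max(candidates, key=lambda w: counts[w])


# Hypothetical miniature corpus standing in for "big.txt".
corpus = "solr search engine indexes documents and the search returns ranked documents"
COUNTS = Counter(words(corpus))
```

Calling `correct("serch", COUNTS)` looks for known words one edit away from the misspelling and picks the most frequent one.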
Imported a set of pages into Apache Solr and compared two ranking algorithms: Lucene's default ranking and PageRank. Solr indexes the documents, and Tika with the TagSoup library extracts text from any kind of HTML found on the web. Developed a PHP client that accepts user input from an HTML form and sends the request to the Solr server; Solr processes the query and returns results, which the PHP program parses and displays. To change Solr's ranking to PageRank, the app loops through each fetched web page and extracts its outgoing links, then uses a mapping file from web pages to their actual URLs to filter out URLs not present in the file. It then builds a network graph with the NetworkX library, with web pages as vertices and links as edges between two files. Finally, it searches for a list of keywords and compares the two algorithms.
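The link filtering and PageRank steps described above can be sketched without any dependencies. This is an illustrative power-iteration version, not the repository's code: it stands in for NetworkX's `pagerank` function, and the damping factor, iteration count, page names, and helper names are assumptions chosen for the example.

```python
def filter_links(links, known_pages):
    """Keep only links whose source and target both appear in the mapping file."""
    return {
        source: [target for target in targets if target in known_pages]
        for source, targets in links.items()
        if source in known_pages
    }


def pagerank(graph, damping=0.85, iterations=50):
    """Compute PageRank by power iteration over a dict of outgoing links."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in graph.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical crawl output: page "a" links to one URL outside the mapping file.
raw_links = {"a": ["b", "c", "external"], "b": ["c"], "c": ["a"]}
known_pages = {"a", "b", "c"}
graph = filter_links(raw_links, known_pages)
ranks = pagerank(graph)
```

The resulting scores could then be written to a file that Solr reads as an external ranking field, though how the repository feeds the scores back into Solr is not stated in the description.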
Spark-Crawler: Evolving Apache Nutch to run on Spark.
Fork of APACHE TIKA - Specific Customizations for textual content extraction and enrichment
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
USC Information Retrieval and Data Science Group
Allows Faculty to set their own slots and manage their assigned slots for the workshops for various courses in college.
XML_Parser