marirazno
Name: Maria
Type: User
Company: NeonAI
Bio: NLP Engineer
Location: Kharkiv, Ukraine
Multiclass text classifier based on BERT architecture.
BERT service implementation
This repository contains a script that extracts ICD codes and diagnoses from medical reports.
We say "make a mistake", but "do a favour"; we say "big surprise", but "great anger"; we say "highly unlikely", but "seriously wrong". Words collocate in interesting and unpredictable ways. Moreover, word collocations can tell us more about the meaning of the word. Your task is to research how verbs from the same synset collocate with adverbs. For example, we usually "love somebody dearly", "honor somebody highly", and "admire somebody greatly". The task: 1) collect more synonyms for this synset: "say", "tell", "speak", "claim", "communicate"; 2) write a function that finds a verb from the synset in the sentence and collects all adverbs that this verb governs (consider only adverbs that end with "-ly"); 3) write a program that collects all verbs and their adverbs in the blog corpus; 4) the output of the program should be the ten most frequent adverbs that collocate with the verb.
Scraping text data from websites for analysis
NaiveBayesClassifier
Bigram-based song generator
The Associated Press Stylebook is a style guide widely used among American journalists. It enforces the following rules for capitalization of news headlines: Capitalize words with 4 or more letters. Capitalize the first and the last word in the headline. Capitalize nouns, pronouns, adjectives, verbs, adverbs, numerals, and subordinating conjunctions. Lowercase all other parts of speech: articles, coordinating conjunctions, prepositions, particles, interjections.
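The rules above can be sketched roughly as follows. This is a simplified illustration, not the repository's implementation: instead of a real part-of-speech tagger it uses a small, hypothetical closed-class word list (`LOWERCASE_WORDS`) to approximate "articles, coordinating conjunctions, prepositions, particles, interjections", so ambiguous short words (for example "up" used as an adverb) may be handled incorrectly. The function name `ap_headline` is also an assumption.

```python
# Hypothetical stand-in for a POS tagger: short words that are usually
# articles, coordinating conjunctions, prepositions, particles, or interjections.
LOWERCASE_WORDS = {
    "a", "an", "the",                              # articles
    "and", "but", "or", "nor", "for", "yet", "so",  # coordinating conjunctions
    "at", "by", "in", "of", "on", "to", "up", "off", "per", "via",  # prepositions/particles
    "oh", "ah",                                    # interjections
}


def ap_headline(headline: str) -> str:
    """Capitalize a headline using an approximation of the AP rules."""
    words = headline.split()
    result = []
    for i, word in enumerate(words):
        is_edge = i == 0 or i == len(words) - 1
        # Capitalize the first/last word, any word with 4+ characters,
        # and any short word not in the closed-class list.
        if is_edge or len(word) >= 4 or word.lower() not in LOWERCASE_WORDS:
            result.append(word[:1].upper() + word[1:])
        else:
            result.append(word.lower())
    return " ".join(result)


print(ap_headline("the old man and the sea"))
```

A faithful version would replace the word list with tags from a POS tagger, since the rules are stated in terms of parts of speech rather than specific words.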
Highlighting terms (nouns and predicates) and topic modeling using spaCy (for Russian). Calculating TF-IDF for relevant term extraction (scikit-learn).
Predicting the star rating of users' comments using a supervised ML algorithm.
Analysis of Ukrainian and Russian texts using Pymorphy
Read about Gematria, a method for assigning numbers to words and for mapping between words having the same number (http://en.wikipedia.org/wiki/Gematria). There are different views on how to count Gematria. Your script will incorporate two different scores. Write a function count_gematria(word, option) that sums the numerical values of the letters of a word using letter_values_1 if option is 1 and letter_values_2 if option is 2:
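A minimal sketch of `count_gematria` might look like the following. The actual `letter_values_1` and `letter_values_2` tables are not shown here, so the two dictionaries below are placeholder assumptions (an ordinal mapping a=1 ... z=26 and its reverse) chosen only to make the example runnable. Treating characters outside the table as 0 is a further assumption.

```python
import string

# ASSUMPTION: placeholder tables, not the ones from the original task.
# Table 1: a=1, b=2, ..., z=26. Table 2: a=26, b=25, ..., z=1.
letter_values_1 = {ch: i for i, ch in enumerate(string.ascii_lowercase, start=1)}
letter_values_2 = {ch: 27 - value for ch, value in letter_values_1.items()}


def count_gematria(word, option):
    """Sum the letter values of `word` using the table selected by `option`."""
    if option == 1:
        table = letter_values_1
    elif option == 2:
        table = letter_values_2
    else:
        raise ValueError("option must be 1 or 2")
    # Characters missing from the table (digits, punctuation) contribute 0.
    return sum(table.get(ch, 0) for ch in word.lower())


print(count_gematria("abc", 1), count_gematria("abc", 2))
```

Swapping in the real tables only requires replacing the two dictionaries; the function body stays the same.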
Write a function real_zen(input_file) that reads zen.txt as input_file and prints "The Zen of Python" in the following format: the title, "by" + the author and then the Zen itself line by line, starting with the line number. You should ignore the comments: The Zen of Python by Tim Peters 1. Beautiful is better than ugly. 2. Explicit is better than implicit. ... 19. Namespaces are one honking great idea -- let's do more of those! Your function should print 2 lines with the title and the author and then 19 more lines with the wisdom about Python, starting with the numbers from 1 to 19. Read all necessary information from the file.
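One possible shape for `real_zen` is sketched below. The layout of zen.txt is not shown here, so this assumes that comment lines start with `#`, that the first non-comment, non-blank line is the title, the second is the author's name, and every remaining line is one aphorism. The helper `format_zen` is a hypothetical name introduced so the formatting can be checked without a file.

```python
def format_zen(lines):
    """Return the formatted output lines for an iterable of raw file lines."""
    # ASSUMPTION: comments start with "#"; blank lines carry no content.
    content = [
        line.strip()
        for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    ]
    # ASSUMPTION: title first, author second, aphorisms after that.
    title, author, aphorisms = content[0], content[1], content[2:]
    formatted = [title, f"by {author}"]
    formatted += [f"{n}. {text}" for n, text in enumerate(aphorisms, start=1)]
    return formatted


def real_zen(input_file):
    """Read `input_file` and print the title, the author, and numbered lines."""
    with open(input_file, encoding="utf-8") as handle:
        for line in format_zen(handle):
            print(line)
```

With a file that matches these assumptions, `real_zen("zen.txt")` would print the title, the "by" line, and then one numbered line per aphorism.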
spaCy part-of-speech tagging model that can identify "Profiles", "Categories", "Goals", "Measures", and "Actions" in text data. Grammar-based rules using POS tagging and dependency parsing are applied on top for better accuracy.
Russian stopwords collection
Semantic similarity, Resnik similarity
1) Text segmentation, 2) Tokenization, 3) Building a concordance, 4) Stemming, 5) Lemmatization
Finding the most popular n-grams based on corpus words or corpus sentences
Calculating word frequencies using NLTK methods
Word distribution in the NLTK Gutenberg corpus
Retrieving all available information about a word from WordNet