Hi all!
Our ideathon is coming up! Here are some resources we can use for our brainstorming session on October 7:
CHEATSHEETS
Python for Data Science
Pandas
NumPy
Scikit-Learn
Keras
OPEN DATA
Kaggle
OpenML (19,629 datasets)
日本のオープンデータ (Japanese open data)
Data.world (US)
UCI Machine Learning Repository (394 datasets, classics in ML)
APIS, LIBRARIES AND MORE
Natural Language Processing
NLTK (good for pre-processing), MeCab (Japanese morphological analyzer, usable alongside NLTK), spaCy (industrial-strength NLP for English, German and French), Gensim (topic modeling), …
Get text data:
Twitter: Twitter API and tweepy
Web Scraping: Beautiful Soup
Computer Vision
OpenCV (image processing, object detection, face detection, face recognition, …)
Quandl (open data: finance, stock prices, demographics, housing, …)