
(Easy) High Performance Text Processing in Machine Learning by Daniel Krasner

See the full post here: www.hakkalabs.co/articles/easy-high-performance-te…

This talk covers rapid development of high-performance, scalable text processing solutions for tasks such as classification, semantic analysis, topic modeling, and general machine learning. We demonstrate how Python modules, and in particular the Rosetta Python library, can be used to process, clean, and tokenize text, extract features, and finally build statistical models from large volumes of text data. The Rosetta library focuses on creating small and simple modules (each with a command line interface) that use very little memory and are parallelized with the multiprocessing package. We will touch on LDA topic modeling and different implementations of it (Vowpal Wabbit and Gensim). The talk will be part presentation and part "real life" example tutorial.
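
As a rough illustration of the "small module, parallelized with multiprocessing" idea described above, here is a minimal sketch using only the Python standard library. It does not use Rosetta's actual API; the names `tokenize` and `count_tokens`, the regular expression, and the default `chunksize` are all assumptions made for this example.

```python
import re
from collections import Counter
from multiprocessing import Pool

# Hypothetical tokenization rule: runs of lowercase letters and digits.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(line):
    """Lowercase a line and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(line.lower())


def count_tokens(lines, processes=1, chunksize=1000):
    """Tokenize an iterable of strings and return token frequencies.

    `lines` can be any iterable of strings, such as an open file.
    With processes=1 the work runs serially in this process; any other
    value is passed to multiprocessing.Pool (None means "use all cores").
    """
    counts = Counter()
    if processes == 1:
        # Serial path: no worker processes are started.
        for tokens in map(tokenize, lines):
            counts.update(tokens)
        return counts
    with Pool(processes) as pool:
        # imap yields results in input order; chunksize batches lines
        # to reduce inter-process communication overhead.
        for tokens in pool.imap(tokenize, lines, chunksize=chunksize):
            counts.update(tokens)
    return counts
```

To use the parallel path from a script, one would typically call something like `count_tokens(open("corpus.txt"), processes=None)` inside an `if __name__ == "__main__":` guard, which multiprocessing requires on platforms that spawn worker processes. The file name here is a placeholder.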

ABOUT DATA COUNCIL:
Data Council (www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for more videos, including DC_THURS, our series of live online interviews with leading data professionals from top open source projects and startups.

FOLLOW DATA COUNCIL:
Twitter: twitter.com/DataCouncilAI
LinkedIn: www.linkedin.com/company/datacouncil-ai
Facebook: www.facebook.com/datacouncilai
Eventbrite: www.eventbrite.com/o/data-council-30357384520
