Eugene Ciurana

Eugene Ciurana is the director of knowledge representation science and technology at Meltwater, where his team is building the most advanced competitive intelligence system in the world by contextualizing 1.2 trillion documents and 12 million unique entities from 10 million data sources worldwide. His previous work as an AI technology principal led to successful exits, products, and services at Cosmify, Yahoo, Summly, and Badoo. Eugene’s career evolved from high availability and scalability systems to deep learning, exploratory analysis of unstructured data, and applied ML over the last 7 years. He has served in direct tech advisory roles at MuleSoft (acquired by Salesforce for $6.5B), Horizons Ventures, TA Venture, JP Morgan Chase, Cloudera, Kearny Jackson, Rubikloud, and Box. Eugene can be reached at https://ciurana.eu or in the Freenode and OFTC networks under /nick pr3d4t0r.

Wanna Be a Data Scientist? Here’s How You Start!

11:30-12:20 Novice

Data scientists and engineers have some overlapping skills, but transitioning from one to the other requires more than coding skills. This session explains how to improve an engineer’s hands-on skills to start a path toward exploratory, applied data science coding and system design.

  • Background
  • Scientists vs engineers: conflicting project and career objectives
  • Data science basics
    • Math
    • Tools
    • Exploratory analysis
  • Unstructured data
    • ETL
    • Data models vs knowledge representation
    • Information vs data
  • Jupyter Lab – hands-on introduction
    • Python
    • R
    • Other languages and tools
  • The path to production
    • Jupyter Notebook vs Lab
    • The development toolchain
    • Unit tests
    • Reproducible results
  • Cloud deployment
    • Docker containers
    • Edge computing
    • AWS, Google Cloud
  • Use case – putting it all together
  • Q&A

Machine Learning and Data Science Tools

10:30-11:20 Novice

ML, NLP, and data science applications require a blend of programming languages, databases, libraries, and services. This session explains how to decide on the tools for the production, integration, and development environments, from the desktop to the largest computational network.

  • Background
  • Team needs: scientists vs engineers
  • Programming
    • Python
    • R
    • Kotlin
    • Java
    • Others
  • Development environments
    • Native
    • IDE
    • Jupyter vs Zeppelin
  • Database selection
    • Application driven
    • Document DB
    • Graph DB
    • Hybrids
  • DevOps
    • Easy to code and debug, easy to deploy and support
    • Containers: when and why
    • VMs: when and why
  • Deep learning tools integration
    • TensorFlow
    • Azure ML
  • Cloud deployment
    • Cheap and it works
  • Putting it all together
  • Q&A

Find the Terrorist!

Day 1 - 27th Nov 16:20-17:10 Main Hall #Influencers Novice

Deeper discussions about data science and how various problems were solved during the development phase – how do you go about finding the one bad guy of interest among 500 million unique entities?