    Druid - Interactive Analytics At Scale

    Druid is one of the most useful and popular tools in the Big Data world. This OLAP system lets you process, store and query data efficiently, which explains why Druid is in such demand among Big Data processing tools.
    With Vladimir Iordanov we will talk about how Druid works, what it consists of and what it can do. Vladimir will introduce Druid's components and explain the cluster architecture and how data processing works. We will also get to see how to work with it in practice.

    About the speaker:
    🔸 Vladimir Iordanov is a BigData Tech Lead at Lohika and has worked on Big Data projects for more than six years. His latest project was directly related to Apache Druid. While working on it, Vladimir gained extensive experience supporting and developing for this system in production. Our speaker will share this experience at BigData Odesa #TechTalks.

    🔹Where: online
    🔹When: 17.12.2020 at 19:00
    🔹Talk language: Russian
    🔹Registration is required
    🔹Admission: donation. We would appreciate a transfer of any amount you are comfortable with to the charity fund “Корпорация монстров”. Ways to donate are listed on the registration page.

    Registration for the online broadcast => https://docs.google.com/forms/d/e/1FAIpQLSdALs-5FgvDKwoAVqJkbvYKOR0UTLrWTvMszDUGl1HIylISrA/viewform

    As always, an interesting topic and a useful evening await you. You will learn how to apply Druid effectively in your own projects when building a big data processing system.

    Putting ML in Production

    A guide and case study on MLOps for software engineers, data scientists and product managers. Deploy ML to production for a real product with live data using open source tools.

    Stanford MLSys Seminar Series

    In this seminar series, we want to take a look at the frontier of machine learning systems, and how machine learning changes the modern programming stack. Our goal is to help curate a curriculum of awesome work in ML systems to help drive research focus to interesting questions.

    MSeg: A Composite Dataset for Multi-domain Semantic Segmentation

    MSeg is a composite dataset that unifies semantic segmentation datasets from different domains. The authors reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.

    Understanding coordinate systems and DICOM for deep learning medical image analysis

    An introduction to key concepts for deep learning in medical imaging, such as coordinate systems and DICOM data extraction, from a machine learning perspective.

    Awesome GPT-3

    This evolving GPT-3 collection includes links to some of the best demos and tutorials around the web. This is a great rabbit hole for anyone interested in understanding how GPT-3 works and where it's going.

    Object Detection from 9 FPS to 650 FPS in 6 Steps

    This article is a practical deep dive into making a specific deep learning model (Nvidia’s SSD300) run fast on a powerful GPU server, but the general principles apply to all GPU programming. The SSD300 is an object-detection model trained on COCO, so its output is a set of bounding boxes with probabilities for 81 object classes.

    The 2020 Data & AI Landscape

    In this post, you will learn about:
    — Key trends in data infrastructure
    — Key trends in analytics & enterprise AI
    — The 2020 landscape
    — Who’s in, who’s out: noteworthy IPOs, M&A and additions

    Using reinforcement learning to personalize AI-accelerated MRI scans

    Our early experiments with the fastMRI data set show that our models outperform the previous active MRI acquisition state of the art over a broad range of acceleration factors.

    Anti-Patterns in NLP (8 types of NLP idiots)

    In this talk, you will learn about common anti-patterns that happen in the industry while solving text problems.

    Interactive, Scalable Dashboards with Vaex and Dash

    Vaex and Dash are open-source libraries that make it easy to build interactive dashboards on the web for millions, and even billions, of data samples using just your Python skills. This tutorial shows what you can do with these libraries and how to use them.

    Deep Learning with PyTorch

    Download a free copy of the full book and learn how to get started with AI / ML development using PyTorch. This book provides a detailed, hands-on introduction to building and training neural networks with PyTorch, a popular open-source machine learning framework.

    How to Do Data Exploration for Image Segmentation and Object Detection

    In this article, the author will share with you how he approaches data exploration for image segmentation and object detection problems.

    Data Science Digest (June 2020)

    Hi folks, I’m happy to share with you the latest Data Science Digest issue featuring Data Science & Machine Learning goodies for June 2020. Please upvote on Habr and applaud on Medium.

    Habr (RU): https://bit.ly/30lGGUR
    Medium (EN): https://bit.ly/3f4xbNS

    The Ultimate Guide to Deploying Machine Learning Models

    This multi-part series is a great resource for learning about model deployment. It covers a variety of topics, including common pitfalls, interfaces, model registries, A/B testing and more.

    OpenCV Social Distancing Detector

    In this tutorial, you will learn what social distancing is and how OpenCV and deep learning can be used to implement a social distancing detector.

    Beyond fashion: Deep Learning with Catalyst

    Step-by-step tutorial for setting up a deep learning pipeline with Catalyst and deploying the model to production.

    How to Scale Data With Outliers for Machine Learning

    In this tutorial, you will discover how to use robust scaler transforms to standardize numerical input variables for classification and regression.
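
    A minimal sketch of the robust scaling idea mentioned above, assuming the usual definition: subtract the median and divide by the interquartile range (IQR). The function name `robust_scale` is hypothetical and only the Python standard library is used; the linked tutorial may rely on a library transform instead.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Center values on the median and scale them by the interquartile range."""
    # method="inclusive" interpolates linearly between the observed data points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; values cannot be scaled")
    center = median(values)
    return [(v - center) / iqr for v in values]


# One extreme value (100) among otherwise small numbers
print(robust_scale([1, 2, 3, 4, 100]))
```

    The outlier stays far from the other values after scaling, but it does not change the center or the scale applied to them, because the median and IQR depend only on the middle of the data.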