Variational Autoencoder (VAE)
Here I discuss one of the two most popular classes of generative models for creating images.
Manim (3Blue1Brown library for math animations) basics
A long time ago I dreamed of a website where mathematicians could exchange their ideas in a visual form, not as "canned" formulae. At that point I was basically broke and did not have enough money or passion to create a decent library myself. Years later a friend of mine showed me the 3Blue1Brown YouTube channel, where a guy was creating beautiful, understandable videos about mathematics (a shame that I had already learnt everything they taught, though). Recently I found out that he had actually open-sourced the library he created for making those animations. This post is about it.
Lasso regression implementation analysis
Implementing the Lasso regression algorithm is not as trivial as it might seem. In this post I investigate the exact algorithm implemented in Scikit-learn, as well as its later improvements.
How does DeepMind AlphaFold2 work?
I believe that DeepMind AlphaFold2 and GitHub Copilot were among the most significant advances in technology made in 2021. Two years after their initial breakthrough, DeepMind released the second version of their revolutionary system for protein 3D structure prediction. This time they basically solved the 3D structure prediction problem, which had stood unsolved for more than 50 years. These are the notes from my detailed talk on the DeepMind AlphaFold2 system.
Overview of consensus algorithms in distributed systems - Paxos, Zab, Raft, PBFT
The field of consensus in distributed systems emerged in the late 1970s and early 1980s. An understanding of consensus algorithms is required for working with fault-tolerant systems, such as blockchains, various cloud and container environments, distributed file systems and message queues. To me it feels like consensus algorithms are a rather pseudo-scientific and needlessly overcomplicated area of computer science research. There is definitely more fuss about consensus algorithms than there should be, and many explanations do not motivate the design at all. In this post I will consider some of the most popular consensus algorithms in the 2020s.
A popular article on distributed computing systems, blockchain and cryptography in the context of elections
In the context of the recent elections, many people who are not technical specialists became interested in blockchain, cryptography and related topics. I was asked to prepare a popular article on the subject.
How to configure a private OpenVPN server (+ client)
Late September is the time of parliamentary elections in Russia. Two weeks before the elections, the Russian government banned several major VPN providers to prevent users from accessing banned opposition websites, through which dissidents coordinate counter-measures against the divide-and-conquer tactics employed by the regime. In this post I'll explain how to set up a basic OpenVPN server and configure a client to overcome this obstacle.
Data structures for efficient NGS read mapping - suffix tree, suffix array, BWT, FM-index
In Next-Generation Sequencing bioinformatics there is the problem of mapping so-called reads, short sequences of ~100 nucleotides, onto the full string that contains them, the reference genome. There are a number of clever optimizations to this process, which I consider in this post.
Why do Huffman trees require a bottom-up walk to be optimal?
Why wouldn't a top-down greedy algorithm work for Huffman trees?
A case study of 20PiB Ceph cluster with 100GB/s throughput
Recently we deployed a Ceph cluster that might be one of the more powerful in Russia in terms of both throughput and storage capacity. I'd like to discuss the nuts and bolts of that system in this post.
Blog version 4
I just released a new version of my personal blog http://borisburkov.net, this time powered by Gatsby.js.
Asyncio ecosystem
I have had a very bad developer experience with Asyncio. It is such a messy and overcomplicated system that I have had to study it from scratch at least 3 times now. I figured it's time to cut my losses and write a post about it!
DeepMind - AlphaFold presentation at EBI
Two months ago the news spread around the world that DeepMind had won CASP, the well-known protein 3D structure prediction competition, beating all the bioinformaticians by an impressive margin. Many people in the biotech world are now trying to figure out 'what was that?' Revolution or evolution, science or engineering, talent or funding? As fate would have it, I once found myself quite close to this field of science, so I spent a few days digging into the details. Meanwhile, Andrew Senior, a lead engineer on the project at DeepMind, came to EBI to build bridges.
Amazon Alexa
I listened to two guys from Amazon's Cambridge office who work on Alexa, and formed a general impression of what it is like to work at Amazon.
Focal loss and Average Precision
A simple loss function for multiclass classification that deals beautifully with class imbalance.
A career in the data empire - a lecture by a data engineer from Facebook
This spring, at an ML/AI conference at Microsoft Research, I briefly discussed how to build a data scientist's career in IT companies with Zoubin Ghahramani, a professor at Cambridge's very strong engineering department and director of the artificial intelligence labs at Uber. Zoubin explained then that your academic credentials usually determine the position you are hired for and your role in the company. And this Sunday I got confirmation of his words from Marek Romanowicz, a data engineer at Facebook in New York.
Postgres roles
The Postgres authentication and permission system sometimes feels like a total mess to me. This is a recap of how it works.
Docker users and user namespaces
Whenever I take a break from DevOps for a few months and switch to other fields, I forget the details of how users within a Docker container map to users on the host machine. This is a condensed recap of user mappings that should save me time when switching contexts.
OpenStack, Kubernetes and OpenShift crash course for the impatient - Kubernetes
Kubernetes is a system for orchestrating containerized applications that can be used to deploy your microservice-based websites to the cloud. Kubernetes was created by Google, based on their internal orchestration system Borg (although the codebase was rewritten completely from scratch). Kubernetes is written mostly in the Go programming language and is open-source.
OpenStack, Kubernetes and OpenShift crash course for the impatient - OpenStack
OpenStack is a pretty old standard for describing cloud resources and interacting with them. Most of its APIs were proposed around 2012. It is "Open" because multiple vendors that provide cloud services (including Rackspace and Red Hat) agreed to use the same API for interacting with them and called it OpenStack.
OpenStack, Kubernetes and OpenShift crash course for the impatient - introduction
Much like a junkie from a Russian joke who started shouting "Jiggers, cops!" when he was brought to the police station, EBI in 2018 suddenly discovered the existence of cloud technologies.
Traction
Most startups don't fail because they can't build the product. Most startups fail because they can't get traction.
BurkovBA.github.io is online!
I've been procrastinating over my blog for almost a year. Initially I wrote it in Angular in early 2017 and rewrote everything in React in the last couple of weeks. At last, following GitHub's "ship early - ship often" motto, I shipped it today. Probably the most challenging aspect of the whole work was making GitHub Pages play nice with a React SPA - I'll tell you how in this post.
Enigma, part 5 - the "Bismarck" and the "debutante"
During the "Battle of the Atlantic" in 1941, the German navy tried to cut Great Britain off from sea links with the continent and the States. The Germans had naval superiority, and for a while they even managed to establish a naval blockade around the islands.
Enigma, part 1 - What is "Enigma"?
What exactly is this famous "Enigma" that everyone was so eager to break, and why was it needed?
Enigma, part 0 - Britain in the Second World War
Before moving on to the actual subject of the story, cryptography and Bletchley Park, I wanted to say a few words about Britain's participation in the war, to give some context.
Enigma. Announcement
Has everyone seen "The Imitation Game"? Cumberbatch is wonderful, of course, and in real life, of course, everything was different. This post is about the mathematicians and engineers of GC&CS (Government Code and Cypher School), led by Alan Turing, who found vulnerabilities in the German "Enigma" and "Lorenz" cipher machines during the Second World War and thereby saved tens or even hundreds of thousands of their compatriots.
Facebook license
A few days ago Facebook changed the licenses of several of its most popular open-source libraries, React, Flow, Jest and Immutable.js, to the standard MIT license.