How to enter the school of data analysis from Yandex. Entering the school of data analysis School of Yandex admission

School selection takes place in three stages:

  1. Online testing: After completing the application form, you will receive an email with a link. Five hours are allotted for solving the tasks of the test.
  2. A written exam: for applicants to the Moscow branch of the ShAD, the exam will take place in person in Moscow at the end of May or at the beginning of June.
    Applicants to the branches and the correspondence department will take the exam online at the beginning of June. Only those who have successfully passed the online testing stage can participate in the written exam.
  3. Interview: at the end of June - beginning of July, for all those who successfully passed the first two stages, interviews will be held at the ShAD offices or via Skype.

Training

Upon admission to the ShAD, knowledge is tested within the framework of the general program, which includes the basic sections of higher algebra, mathematical analysis, combinatorics, probability theory, as well as the basics of programming. Examples of written exam tasks:

  • 2012 set
  • 2013 set
  • 2014 set
  • 2016 set
  • 2017 set

Paid training

Applicants who showed themselves well at the interview, but did not pass the general competition, will be able to start studying on a paid basis (only in the Moscow branch). Paid studies are no different from free ones - you need to complete all the same difficult tasks, meeting tight deadlines. Tuition costs 110,000 rubles per semester. If a student completes the semester with "good" and "excellent", the tuition fee for him is reduced to 55,000 per semester. Those who passed with "good" and "excellent" two sessions in a row continue to study for free.

Summer is the time for entrance exams. Right now, the selection to the Yandex School of Data Analysis is being completed - interviews are underway for those who have already passed the exam. The School of Computer Science teaches machine learning, computer vision, natural language text analysis and other areas of modern Computer Science. For two years, students study subjects that are not usually included in university programs, although they are in great demand both in science and in industry. You can study not only in Moscow - the School has branches in Yekaterinburg, Minsk, Kyiv, Novosibirsk, St. Petersburg. There is also a correspondence department, where you can study by watching video lectures and corresponding with the teachers of the Moscow School by mail.

But in order to enter the ShAD, you need to successfully go through three stages - fill out an application form on the site, pass the entrance exam and come for an interview. Every year senior students, graduates and postgraduates of Moscow State University, Moscow Institute of Physics and Technology, Higher School of Economics, ITMO, St. Petersburg State University, UrFU, NSU enter the ShAD, and not all of them cope with our tests. This year we received questionnaires from 3500 people, 1000 of whom were admitted to the exam, and only 350 passed it successfully.

For those who want to try themselves and understand what they are capable of, we have prepared an analysis of this year's entrance exam. The option we chose for you was completed by 56% of those who solved it. In this table, you can see how many people were able to solve each of the tasks in it.

But first, I would like to explain what we check with the exam and how we approach its preparation. In the very first years of the existence of the SAD, there was no written exam, since there were still few applications, and it was possible to talk with everyone who passed the online test in person. But then the interviews were longer; some graduates recall being interviewed for six hours, offering many challenging tasks. Then there were more applicants - and in 2012 a written exam appeared.

The curators of the Moscow ShAD are involved in the creation of the variant, one of which I am; with the selection of tasks they are assisted by colleagues from branches. The number of tasks in the variant has not changed much over these four years: at first there were seven of them, and last year there were eight. Each option has math problems (five to seven) and algorithm problems (one or two).

As for mathematics, we, of course, check whether applicants master the main sections of the program: algebra, calculus, combinatorics and probability theory. But what is important to us is not the knowledge that is achieved by cramming and forgotten a week after the test or exam - like terrible formulas from the table of indefinite integrals or the Student's distribution function; that is why we allow applicants to take any paper sources with them to the written exam. Much more valuable is understanding the essence of what is happening, as well as the ability to apply standard facts and methods in unusual situations. We also try to keep computational complexity to a minimum; even two-digit numbers have to be multiplied infrequently. So during the exam you will not find routine and tedious computational exercises, and many tasks will seem non-standard and maybe even Olympiad.

In the algorithm part, we avoid problems that require knowledge of specific data structures (search trees, hash tables, etc.) or algorithms (fast sorting algorithms, graph shortest path algorithms, etc.). In addition, we do not require applicants to write an implementation of an invented algorithm in any programming language; rather, on the contrary - in every possible way we discourage this. Indeed, in the written exam, we are most interested not in programming skills, but in the ability to clearly describe the algorithm and, if necessary, convince the reader that it satisfies the restrictions on the running time and the amount of allocated memory. However, decisions containing code in any language that we can read are also accepted, but they are more difficult to check and, in addition, they still must be accompanied by a rationale for correctness.

Task 1

Find the limit of the sequence (a n) for which

Answer


Decision

Let us first prove that the sequence converges. If a a n< 0 , then a n+1< 0 , so it is bounded from above. Let's compare a n and a n+1:


We see that at a n ∈(-1;0) there is an inequality a n< a (n+1) , that is, the sequence is increasing. By the Weierstrass theorem, it has a limit. To find it, let's go to the limit in our recurrence relation:
whence the limit can be one of the numbers 0, -1, and 4. It's easy to see that this is 0.

Task 2

On a plane tiled with identical rectangles with sides 10 and 20 (rectangles adjoin sides), draw a random circle of radius 4. Find the probability that the circle has common points with exactly three rectangles.

Answer


Decision

We will monitor the position of the center of the circle. It is clear that we can restrict our consideration to the interior of a single rectangle. It is easy to see that in order for the circle to intersect exactly three rectangles, two conditions must be met: (1) the distances from the center to the two nearest sides of the rectangle must be less than 4; (2) the distance to the nearest vertex of the rectangle must be greater than 4. Knowing this, we can draw a set of points that satisfy these conditions.

Therefore, the desired probability is equal to

Task 3

Dima and Vanya take turns filling in the size matrix 2n×2n. Vanya's goal is to make the resulting matrix have an eigenvalue of 1, and Dima's goal is to prevent him. Dima goes first. Do any of them have a winning strategy?

Answer

With the right strategy, Vanya will win.


Decision

The resulting matrix BUT will have an eigenvalue of 1 if the matrix A - E will be degenerate. Vanya can achieve this, for example, in the following way. After Dima entered some element aij, Vanya enters a new element aik on the same line so that a ik -δ ik =-(a ij -δ ij), where δij is the Kronecker symbol. Then the sum of the numbers in each row of the matrix A-E will be equal to zero, that is, the matrix A - E will be degenerate.

Task 4

Find the determinant of a matrix A=(aij), where

Answer


Decision

We use the formula Subtract the previous one from each row of the matrix, and then the previous one from each column. The resulting matrix will look like:


Continuing the argument by induction, we make sure that the determinant of the original matrix is ​​equal to the determinant of the unit matrix, i.e. one.

Task 5

Given two arrays of integers a and b, and all elements b different. It is required to find a set of indices i_1< i_2 <… < i_k , for which the set a,...,a is a permutation of the elements of the array b, and the difference i_k - i_1 minimum possible. Time limit - O(nk)(but maybe you can faster), from memory - O(n).

Decision

This can be done in one pass through the array a. Every time we encounter an array element b, we write it and its number into special arrays. At the same time, we maintain segment I in these arrays, on which we hope to find all the different elements b. It is clear that if the next element of the array a coincides with the first element of the segment I, then I clearly cannot be the shortest a segment that satisfies the condition of the problem, and we can move its left end. If at the next step we understand that I contains all distinct elements b, then I is a candidate for the answer; in this case, we also shift its left end.

Grade O(n) obvious from memory. Grade O(nk) complexity can be justified as follows: we do everything in one pass (hence n) and at each step must look for an element in the array b(hence k). It is clear that the algorithm can be improved: if we first sort b and use binary search, we get O(n log k). If you use perfect hashing, then you can achieve complexity O(n+k).

Task 6

In 2222, volleyball tournaments are held according to a new system. They say the A team surpasses team B if A beats B or any team that beats B. Each pair of teams plays 1 time. Ties are ruled out by volleyball rules. The team that outperforms all other teams is declared the champion. (a) Prove that there must be a champion (b) Prove that there cannot be exactly two champions.

Decision

Let's agree that each team for the tournament receives points equal to the number of teams surpassed by it. We first prove the following simple lemma:

Lemma. Let team E not outperform team K. Then K scored more points than E.

Proof. If E does not exceed K, then K has defeated Team E, as well as all the teams that Team E has won.

Now let X be the team beaten by team E. If E beat X, then K also beat X. So K beats X. If E beat team F, who beat X, then note that K also won y F. So, K beat F, which beat X, that is, K beats X. In total, K beats all the teams that beat E, and even E to boot, that is, at least one team more than E The lemma is proved.

(a) Let A be the team with the most points. Let's prove that A is a champion. Let's say it's not, then there's team B that A didn't beat. By the lemma, we get that B has earned more points than A. A contradiction.

(b) Suppose we have two champions: A and B. They played each other; let, for example, A win. Since B is superior to all other teams (and A in particular), then B defeated some team that won against A.

Assume for a start that there are teams that won both A and B. Then it can be shown that the one of them (let's call it C), which scored the most points, will be the third champion. Indeed, let E be the team that was not beaten by C. Then, firstly, E won both A and B, and secondly, E earned more points than C. A contradiction.

Suppose now that there are no teams that have won both A and B. Consider the set of all such teams that have won A but lost B. Note that it is non-empty (see above). Among them, take the team with the highest number of points. Then, using the lemma, we can establish that this team is the third champion.

Task 7

Calculate Integral

Yandex opens a new enrollment to the School of Data Analysis. These are free two-year evening courses for those who want to get an education in the field of data processing and analysis and extracting information from the Internet. The school requires good mathematical preparation and is designed primarily for students and young graduates of engineering and mathematical specialties.

How to proceed

You need to fill out a form on the school website until May 15. After that, you will receive an email with a link to an online test in mathematics and basic programming. All those who successfully complete the test will be invited by the school to a written exam, which will take place in late May - early June. The best results of the exam will have to go through an interview, following which a final decision will be made.

Training program

On the school website, you can study the exam tasks of past years and find out what you should prepare for. You can get acquainted with the teachers of the school and learn about the new direction "Big Data" at the Open Day of the ShAD. It will take place April 19 in the Moscow office of Yandex, to participate you need to register.

School classes are held on weekday evenings. You can study at school in person or in absentia, according to video lectures. During training or after graduation, students can take an internship at Yandex.

The School of Data Analysis has existed since 2007 and has graduated more than 300 specialists, many of whom are engaged in science, work for Yandex and other large IT companies in Russia and abroad. Branches of the ShAD exist in St. Petersburg (within the Computer Science Center), Novosibirsk, Yekaterinburg, Minsk and Kyiv.

Hello! We are glad to congratulate you on your admission to the School of Data Analysis! Closer to September, the curator of your branch will write about organizational issues.

So, I'm in school. And, almost sure, the oldest student there. There will be no problems with couples, you can even go to the skating rink (except that the rides with the instructor may have to be postponed for the weekend). Now how did I do it.

An acquaintance offered to try his luck: "you can." The online selection was hell and darkness, I suffered for four hours. Although, to be honest, I cheated a little: in programming assignments, I simply translated programs from pseudocode to C ++, and even one task on matrices, without picking up a key, I simply solved it in the forehead with an executable. I didn’t know what a “positive inertia index” was (did I spell this name correctly?) - I had to look it up, it turned out to be just the number of positive elements in the diagonal expansion of a quadratic form.

Well, the second stage is the face-to-face exam. I bought a reading room, overlaid with notes and began to prepare. Most of all I was afraid of terrible integrals: any first-timer will outdo me in this. Well, to the point. This is what the Yandexoids offered us at the exam (the conditions of the tasks are reduced).

  1. How many ways are there to get from (0,0,0) to ( n, 2n, 3n) if it is possible to take +1 steps along any of the axes?
  2. Find the 319th derivative at the zero of the function (x²+17) / (x 4 −5x²+4)
  3. How many permutations commute with (123)(456)?
  4. In an equilateral triangle ABC area 1 choose a point M. Find the expectation of the area ABM.
  5. ∫ 1 / √1+e x dx
  6. Show that an integer matrix does not have rational (non-integer) eigenvalues.
  7. There are cans of gasoline on the circular road. There is a car with a known fuel consumption and an empty tank of unlimited capacity. For O( n) operations to find out from which canister you need to start, so that, collecting fuel, drive all the way and not stop empty (or say that this is impossible).

I solved 6 problems - except, of course, the integral. True, I was worried, and I decided 2 and 3 incorrectly (with the right technique!)

At the interview, they asked more about the personal: why did you decide to go to the School, is it not difficult for you to work, nothing that everyone is younger than you? The answer was delayed for four days (in the early days I periodically shook mail via the web when my partner turned away). And finally they answered.

Positive admission experience. I remember myself as a fighter. I finally bought a reader (and I do not part with the device, the purchase is on point).

Negative experience. I should have calmed down, then tasks 2 and 3 would have come out. It was not worth solving the integral at all - or in preparation to devote more time to integrals. Finally, the preparation in the form in which it was, gave little. I tightened up the theorems, remembered how this or that thing is justified, but of this goodness, only a record of the permutations was required.

Summer is the time for entrance exams. Right now, the selection to the Yandex School of Data Analysis is being completed - interviews are underway for those who have already passed the exam. The School of Computer Science teaches machine learning, computer vision, natural language text analysis and other areas of modern Computer Science. For two years, students study subjects that are not usually included in university programs, although they are in great demand both in science and in industry. You can study not only in Moscow - the School has branches in Yekaterinburg, Minsk, Kyiv, Novosibirsk, St. Petersburg. There is also a correspondence department, where you can study by watching video lectures and corresponding with the teachers of the Moscow School by mail.

But in order to enter the ShAD, you need to successfully go through three stages - fill out an application form on the site, pass the entrance exam and come for an interview. Every year senior students, graduates and postgraduates of Moscow State University, Moscow Institute of Physics and Technology, Higher School of Economics, ITMO, St. Petersburg State University, UrFU, NSU enter the ShAD, and not all of them cope with our tests. This year we received questionnaires from 3500 people, 1000 of whom were admitted to the exam, and only 350 passed it successfully.

For those who want to try themselves and understand what they are capable of, we have prepared an analysis of this year's entrance exam. The option we chose for you was completed by 56% of those who solved it. In this table, you can see how many people were able to solve each of the tasks in it.

But first, I would like to explain what we check with the exam and how we approach its preparation. In the very first years of the existence of the SAD, there was no written exam, since there were still few applications, and it was possible to talk with everyone who passed the online test in person. But then the interviews were longer; some graduates recall being interviewed for six hours, offering many challenging tasks. Then there were more applicants - and in 2012 a written exam appeared.

The curators of the Moscow ShAD are involved in the creation of the variant, one of which I am; with the selection of tasks they are assisted by colleagues from branches. The number of tasks in the variant has not changed much over these four years: at first there were seven of them, and last year there were eight. Each option has math problems (five to seven) and algorithm problems (one or two).

As for mathematics, we, of course, check whether applicants master the main sections of the program: algebra, calculus, combinatorics and probability theory. But what is important to us is not the knowledge that is achieved by cramming and forgotten a week after the test or exam - like terrible formulas from the table of indefinite integrals or the Student's distribution function; that is why we allow applicants to take any paper sources with them to the written exam. Much more valuable is understanding the essence of what is happening, as well as the ability to apply standard facts and methods in unusual situations. We also try to keep computational complexity to a minimum; even two-digit numbers have to be multiplied infrequently. So during the exam you will not find routine and tedious computational exercises, and many tasks will seem non-standard and maybe even Olympiad.

In the algorithm part, we avoid problems that require knowledge of specific data structures (search trees, hash tables, etc.) or algorithms (fast sorting algorithms, graph shortest path algorithms, etc.). In addition, we do not require applicants to write an implementation of an invented algorithm in any programming language; rather, on the contrary - in every possible way we discourage this. Indeed, in the written exam, we are most interested not in programming skills, but in the ability to clearly describe the algorithm and, if necessary, convince the reader that it satisfies the restrictions on the running time and the amount of allocated memory. However, decisions containing code in any language that we can read are also accepted, but they are more difficult to check and, in addition, they still must be accompanied by a rationale for correctness.

Task 1

Find the limit of the sequence (a n) for which

Answer


Decision

Let us first prove that the sequence converges. If a a n< 0 , then a n+1< 0 , so it is bounded from above. Let's compare a n and a n+1:


We see that at a n ∈(-1;0) there is an inequality a n< a (n+1) , that is, the sequence is increasing. By the Weierstrass theorem, it has a limit. To find it, let's go to the limit in our recurrence relation:
whence the limit can be one of the numbers 0, -1, and 4. It's easy to see that this is 0.

Task 2

On a plane tiled with identical rectangles with sides 10 and 20 (rectangles adjoin sides), draw a random circle of radius 4. Find the probability that the circle has common points with exactly three rectangles.

Answer


Decision

We will monitor the position of the center of the circle. It is clear that we can restrict our consideration to the interior of a single rectangle. It is easy to see that in order for the circle to intersect exactly three rectangles, two conditions must be met: (1) the distances from the center to the two nearest sides of the rectangle must be less than 4; (2) the distance to the nearest vertex of the rectangle must be greater than 4. Knowing this, we can draw a set of points that satisfy these conditions.

Therefore, the desired probability is equal to

Task 3

Dima and Vanya take turns filling in the size matrix 2n×2n. Vanya's goal is to make the resulting matrix have an eigenvalue of 1, and Dima's goal is to prevent him. Dima goes first. Do any of them have a winning strategy?

Answer

With the right strategy, Vanya will win.


Decision

The resulting matrix BUT will have an eigenvalue of 1 if the matrix A - E will be degenerate. Vanya can achieve this, for example, in the following way. After Dima entered some element aij, Vanya enters a new element aik on the same line so that a ik -δ ik =-(a ij -δ ij), where δij is the Kronecker symbol. Then the sum of the numbers in each row of the matrix A-E will be equal to zero, that is, the matrix A - E will be degenerate.

Task 4

Find the determinant of a matrix A=(aij), where

Answer


Decision

We use the formula Subtract the previous one from each row of the matrix, and then the previous one from each column. The resulting matrix will look like:


Continuing the argument by induction, we make sure that the determinant of the original matrix is ​​equal to the determinant of the unit matrix, i.e. one.

Task 5

Given two arrays of integers a and b, and all elements b different. It is required to find a set of indices i_1< i_2 <… < i_k , for which the set a,...,a is a permutation of the elements of the array b, and the difference i_k - i_1 minimum possible. Time limit - O(nk)(but maybe you can faster), from memory - O(n).

Decision

This can be done in one pass through the array a. Every time we encounter an array element b, we write it and its number into special arrays. At the same time, we maintain segment I in these arrays, on which we hope to find all the different elements b. It is clear that if the next element of the array a coincides with the first element of the segment I, then I clearly cannot be the shortest a segment that satisfies the condition of the problem, and we can move its left end. If at the next step we understand that I contains all distinct elements b, then I is a candidate for the answer; in this case, we also shift its left end.

Grade O(n) obvious from memory. Grade O(nk) complexity can be justified as follows: we do everything in one pass (hence n) and at each step must look for an element in the array b(hence k). It is clear that the algorithm can be improved: if we first sort b and use binary search, we get O(n log k). If you use perfect hashing, then you can achieve complexity O(n+k).

Task 6

In 2222, volleyball tournaments are held according to a new system. They say the A team surpasses team B if A beats B or any team that beats B. Each pair of teams plays 1 time. Ties are ruled out by volleyball rules. The team that outperforms all other teams is declared the champion. (a) Prove that there must be a champion (b) Prove that there cannot be exactly two champions.

Decision

Let's agree that each team for the tournament receives points equal to the number of teams surpassed by it. We first prove the following simple lemma:

Lemma. Let team E not outperform team K. Then K scored more points than E.

Proof. If E does not exceed K, then K has defeated Team E, as well as all the teams that Team E has won.

Now let X be the team beaten by team E. If E beat X, then K also beat X. So K beats X. If E beat team F, who beat X, then note that K also won y F. So, K beat F, which beat X, that is, K beats X. In total, K beats all the teams that beat E, and even E to boot, that is, at least one team more than E The lemma is proved.

(a) Let A be the team with the most points. Let's prove that A is a champion. Let's say it's not, then there's team B that A didn't beat. By the lemma, we get that B has earned more points than A. A contradiction.

(b) Suppose we have two champions: A and B. They played each other; let, for example, A win. Since B is superior to all other teams (and A in particular), then B defeated some team that won against A.

Assume for a start that there are teams that won both A and B. Then it can be shown that the one of them (let's call it C), which scored the most points, will be the third champion. Indeed, let E be the team that was not beaten by C. Then, firstly, E won both A and B, and secondly, E earned more points than C. A contradiction.

Suppose now that there are no teams that have won both A and B. Consider the set of all such teams that have won A but lost B. Note that it is non-empty (see above). Among them, take the team with the highest number of points. Then, using the lemma, we can establish that this team is the third champion.

Task 7

Calculate Integral