Self driving: scratching the surface of the topics

The “self driving” issue is becoming central in the mainstream news, and it will probably reshape the vehicle market in the next few years.

When we think about “self driving” and its technical challenges, it is always difficult to stay up to date on the emerging ideas and concepts.

It’s easy to make the wrong inference that, since the web is full of technical information and scientific papers, the knowledge is also readily usable. Unfortunately, scientific papers are often weighed down by hypertechnical (and sometimes self-referential) jargon, and the overproduction of publications makes them hard to keep up with.

The starting point

To extract some insights on this topic, we analyzed several articles published in the arXiv database.

ArXiv seems to be a good starting point: it is probably the most authoritative preprint database and its API is simple and fast.

Our study is only at its first step, but we wanted to share some early findings, and we would be glad to receive ideas for the next steps.

This analysis does not claim to be an exhaustive overview of the topic. It’s only an attempt to scratch the surface of the mass of information.

This work is also a chance for us to try some techniques that could help extract insights from a corpus of technical documents without the knowledge of an average PhD in each specific topic (and anyway, holding a PhD is not enough to guarantee success :) ).

Technique

We tried several approaches, and the following is the one that seems to give the most promising starting points.

- We retrieved a list of papers from the arXiv API, starting from some generic queries related to the topic “self driving”.

- We filtered the results, removing obviously unrelated topics (e.g. all the astrophysics papers).

- We’ve transformed the obtained corpus to a matrix (tf-idf with some tweaks) and created a graph based on the connections among words.

- In order to reduce the noise and to extract clearer data, we’ve detected the communities inside this graph using a hierarchical algorithm.
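
The first step of the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code behind this analysis: the helper names `build_query_url` and `parse_feed` and the sample feed are hypothetical, and the only things taken as given are the public arXiv query endpoint and its Atom response format. The sketch builds the request URL and parses a hard-coded sample response, so it runs without network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # XML namespace of Atom feeds


def build_query_url(phrase, start=0, max_results=50):
    """Build an arXiv API URL that searches all fields for a quoted phrase."""
    params = {
        "search_query": f'all:"{phrase}"',
        "start": start,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def parse_feed(atom_xml):
    """Extract (title, summary) pairs from an Atom feed string."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.iter(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        summary = entry.findtext(f"{ATOM}summary", default="")
        # Collapse the line breaks and indentation that feeds often contain.
        papers.append((" ".join(title.split()), " ".join(summary.split())))
    return papers


# Hypothetical sample response (made-up entry, for illustration only).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>A made-up paper on
      lane detection</title>
    <summary>Placeholder abstract.</summary>
  </entry>
</feed>"""

url = build_query_url("self driving", max_results=10)
papers = parse_feed(SAMPLE_FEED)
```

In a real run, the URL would be fetched (for example with `urllib.request.urlopen`) and the response body passed to `parse_feed`; the filtering of unrelated papers would then work on the returned titles and abstracts.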

The final product of this process is a tree that allows us to explore some of the most promising n-grams and their closeness in the papers.
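
To give an idea of the matrix, graph and community steps, here is a deliberately simplified sketch. Everything in it is an assumption made for illustration: it uses single words instead of n-grams, a plain tf-idf without tweaks, links only the top-weighted terms of each document, and replaces the hierarchical community detection with simple connected components. The function names and the four toy "documents" are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def tfidf(docs):
    """Return one {term: weight} dict per document (tf * log(N / df))."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def cooccurrence_graph(weights, top_k=3):
    """Link the top_k highest-weighted terms of each document to one another."""
    graph = defaultdict(set)
    for doc_weights in weights:
        # Sort by descending weight, breaking ties alphabetically.
        top = sorted(doc_weights, key=lambda t: (-doc_weights[t], t))[:top_k]
        for a, b in combinations(top, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def components(graph):
    """Group terms into connected components (a crude stand-in for
    hierarchical community detection)."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Toy corpus, loosely inspired by two of the clusters discussed in this post.
docs = [
    "kalman filter collision avoidance",
    "kalman filter autonomous navigation",
    "attacker models malicious users",
    "malicious users attacker detection",
]
groups = components(cooccurrence_graph(tfidf(docs)))
```

On this toy corpus the terms split into two groups, one about navigation and one about security. A real pipeline would more likely rely on dedicated libraries for the vectorization and for the community detection.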

The last step of this analysis is similar to what a gold prospector does with a sieve: we took a closer look at some of the n-grams by reading some of the related papers and other material available on the web.

For this small post we’ve selected the top n-grams representing each cluster, trying to contextualize the underlying topics.

Topics

Topics emerging from the corpus of papers related to self driving

- machine learning algorithms, deep convolutional neural, machine learning techniques, deep neural network, computer vision, lane detection

This is the most expected topic, because it is related to some of the keywords that dominate popular science news and modern job titles.

Machine learning algorithms and deep neural nets are some of the techniques most strongly tied to recent improvements in the effectiveness of “computer vision”.

- iterative consensus clustering, learning method, gaussian process

Iterative consensus clustering is an interesting ensemble method that combines the outcomes of several clustering algorithms in order to improve the precision of the final grouping.
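
The general idea of combining several clusterings can be sketched with a co-association matrix. Note the assumptions: this is a generic, single-pass consensus and not the iterative variant named in the papers, the function names are hypothetical, and the three input labelings are made up.

```python
def co_association(labelings):
    """Fraction of labelings in which each pair of items shares a cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]


def consensus_labels(labelings, threshold=0.5):
    """Group items that co-occur in more than `threshold` of the labelings."""
    matrix = co_association(labelings)
    n = len(matrix)
    labels = [-1] * n  # -1 means "not assigned yet"
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        stack = [start]
        while stack:
            i = stack.pop()
            if labels[i] != -1:
                continue
            labels[i] = current
            stack.extend(
                j for j in range(n)
                if labels[j] == -1 and matrix[i][j] > threshold
            )
        current += 1
    return labels


# Three made-up clusterings of the same five items (label names are arbitrary).
labelings = [
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
]
consensus = consensus_labels(labelings)
```

Item 2 is placed differently by the three inputs, but two out of three put it with items 3 and 4, so the consensus follows the majority.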

Gaussian processes are everywhere in statistics and in physics, so it is no surprise to find them here. This cluster also contains the learning methods that are “monopolizing” big data analysis.

- attacker models, malicious users

This pair is really interesting because it shows that the scientific community is taking the security issue into account.

With the growing number of interacting systems and autonomous algorithms, and with the constant production of huge amounts of data, the number of weak points that a malicious user can exploit will increase.

- autonomous vehicle control, decision control, linear program, game theory, artificial intelligence, autonomous vehicle

Interestingly, (classic) linear programming, game theory and artificial intelligence (in its most general sense) emerge, rightly, as a key topic of their own, separate from machine learning and neural nets.

- kalman filter, collision avoidance, autonomous navigation

The Kalman filter is a fast and reliable technique used extensively in robotics to filter out statistical noise. Here it is related to the topic of “collision avoidance”, which is (obviously) not a secondary issue when it comes to self driving cars!
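
To make the noise-filtering idea concrete, here is a minimal one-dimensional Kalman filter. It is only a sketch under simplifying assumptions: the tracked quantity is a single, roughly constant value (say, a distance in metres), and the function name, the noise variances and the measurements are all made up.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Filter noisy scalar measurements of a (nearly) constant quantity.

    x is the current estimate, p its variance (our uncertainty about it).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, so only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1 - gain) * p
        estimates.append(x)
    return estimates


# Made-up noisy readings of a distance that is actually about 10 metres.
readings = [10.4, 9.7, 10.2, 9.9, 10.1, 9.8]
estimates = kalman_1d(readings, process_var=0.01, measurement_var=0.25, p0=100.0)
```

The large initial variance `p0` tells the filter to trust the first reading almost completely; after that, each new reading moves the estimate less and less, which is what smooths out the noise.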

- fuzzy inverse model, learning algorithm, address problem, internet things

Fuzzy models are another topic strongly related to this kind of problem. More generally, the idea of using some kind of fuzzy logic or a set of fuzzy constraints is a classic one: it is not new, but it is still useful.
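
As a generic illustration of fuzzy sets (not of the “fuzzy inverse model” from the papers), the sketch below turns a gap to the vehicle ahead into a braking level. Everything here is hypothetical: the function names, the three fuzzy sets with their breakpoints in metres, and the braking level attached to each set. The output is a weighted average of constant rule outputs, in the style of a zero-order Sugeno system.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets for the gap (metres) and a brake level for each.
RULES = [
    ((0.0, 5.0, 20.0), 1.0),    # gap is "close"  -> brake hard
    ((10.0, 25.0, 40.0), 0.4),  # gap is "medium" -> brake gently
    ((30.0, 50.0, 70.0), 0.0),  # gap is "far"    -> do not brake
]


def brake_level(gap):
    """Weighted average of rule outputs, weighted by membership degree."""
    degrees = [(triangular(gap, *shape), level) for shape, level in RULES]
    total = sum(degree for degree, _ in degrees)
    if total == 0.0:
        return 0.0  # no rule fires for this gap
    return sum(degree * level for degree, level in degrees) / total


level_at_15m = brake_level(15.0)
```

A gap of 15 metres is partly “close” and partly “medium”, so the resulting braking level falls between the two rule outputs instead of jumping from one to the other.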

Here we also found the connection with the IoT (Internet of Things), another refrain pumped up by several sponsors.

- kitti oxford robotcar, camera without relying, complex urban environments, deep semantic segmentation, lane markings, equipped monocular camera

This group contains useful information for anyone who wants to experiment with algorithms related to the main topic.

The n-gram “kitti oxford robotcar” merges the names of two public driving datasets: KITTI and the huge Oxford RobotCar Dataset: http://mrg.robots.ox.ac.uk/the-oxford-robotcar-dataset/.

- deep reinforcement learning, markov decision process

We expected to find Markov techniques more often in this analysis.

Markov techniques are broadly used in artificial intelligence because they are powerful but, at the same time, feasible in terms of computational requirements.

We find another deep learning technique in this cluster: deep reinforcement learning, which is heavily sponsored nowadays.
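
A Markov decision process small enough to solve by hand shows why these techniques are computationally cheap. The sketch below applies classic value iteration (no deep learning involved) to a made-up three-state driving problem. The states, actions, probabilities and rewards are all hypothetical, as are the function names.

```python
def action_value(outcomes, values, gamma):
    """Expected return of an action given its (prob, next_state, reward) list."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma=0.9, tol=1e-9):
    """Repeat Bellman updates until the values stop changing."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state: value stays 0
                continue
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(transitions, values, gamma=0.9):
    """Pick, in each non-terminal state, the action with the best value."""
    return {
        s: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for s, actions in transitions.items()
        if actions
    }


# Made-up problem: transitions[state][action] = [(prob, next_state, reward)].
transitions = {
    "clear": {
        "cruise": [(0.9, "clear", 1.0), (0.1, "obstacle", 1.0)],
        "brake": [(1.0, "clear", 0.0)],
    },
    "obstacle": {
        "cruise": [(0.5, "clear", 1.0), (0.5, "crash", -100.0)],
        "brake": [(1.0, "clear", -1.0)],
    },
    "crash": {},  # terminal
}
values = value_iteration(transitions)
policy = greedy_policy(transitions, values)
```

The resulting policy cruises on a clear road and brakes when an obstacle appears. Deep reinforcement learning targets the same kind of problem when the state space is far too large to enumerate like this.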

- kinodynamic motion planning, finite elements, constrained optimization problem

This group of words is really interesting to us because we had never heard of kinodynamic motion planning (our ignorance). It’s nice because it sounds like technobabble from Star Trek, and so it reminds us of the 90s.

After a brief study, we now know that it is a class of problems in which velocity, acceleration, and force/torque bounds must be satisfied, together with kinematic constraints such as avoiding obstacles.
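
The definition above can be illustrated with a feasibility check on a candidate trajectory. This is a sketch of the constraint check only, not a planner, and it makes strong assumptions: motion along a single axis, positions sampled at a fixed time step, velocities and accelerations estimated by finite differences, obstacles given as intervals, and force/torque bounds left out. All names and numbers are hypothetical.

```python
def is_feasible(positions, dt, v_max, a_max, obstacles=()):
    """Check speed, acceleration and obstacle constraints on a 1-D trajectory.

    positions: samples taken every dt seconds.
    obstacles: (low, high) intervals that no sample may fall into.
    """
    velocities = [(b - a) / dt for a, b in zip(positions, positions[1:])]
    accelerations = [(b - a) / dt for a, b in zip(velocities, velocities[1:])]
    if any(abs(v) > v_max for v in velocities):
        return False  # dynamic constraint: speed bound violated
    if any(abs(a) > a_max for a in accelerations):
        return False  # dynamic constraint: acceleration bound violated
    # Kinematic constraint: no sample may lie inside an obstacle.
    return not any(low <= x <= high for x in positions for low, high in obstacles)


# Made-up trajectory: positions in metres, one sample per second.
trajectory = [0.0, 1.0, 2.5, 4.0, 5.0]
ok = is_feasible(trajectory, dt=1.0, v_max=2.0, a_max=1.0)
too_jerky = is_feasible(trajectory, dt=1.0, v_max=2.0, a_max=0.4)
blocked = is_feasible(trajectory, dt=1.0, v_max=2.0, a_max=1.0,
                      obstacles=[(2.4, 2.6)])
```

A kinodynamic planner has to search for a trajectory that passes all of these checks at once, which is what makes the problem harder than pure path planning.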

Conclusions

Like any well financed and highly visible target, the “self driving” issue is a collector of interesting new ideas (and also of boring and recycled ones).

This topic is particularly multifaceted, and we are sure that in the near future some interesting studies will spill over into other sectors.

It will also be a huge task to align laws and regulations with this change of paradigm. In a world where the market of science and tech seems to be self-sustaining, the goal of keeping tech (not pure science) bound to the needs of humanity seems to be the truly difficult task.

Who

We are Elif Lab, a data company that provides analyses on politics, society and institutions.

We are also working on a project called ThinkingAbout.EU that analyzes various data coming from the European institutions.