Image processing, voice recognition, text interpretation – Régens artificial intelligence (AI) developments, 2018

17 December 2018

Over the past period we have taken part in several AI-based research projects and have also developed our own prototypes and solutions. As the year draws to a close, we present some of these areas of application, hoping you will find them useful for your business practice.

The number of potential applications and uses of artificial intelligence increases from day to day, so we are getting closer to a future that a few years ago seemed unimaginable.

Advances in AI and machine learning can bring major transformations to any business. The wave of automation built on these technologies frees skilled employees from monotonous, repetitive tasks and thus creates opportunities to allocate company resources more efficiently.

It goes without saying that we are actively involved in exploiting the potential of artificial intelligence ourselves!

AI in business

Whatever the problem to be solved, the first step is to gather and clean large amounts of relevant data. The algorithms of artificial intelligence use this data to detect patterns and rules. The targeted collection and tedious cleaning of data are therefore decisive and time-consuming tasks, and they form the heart of any such program. Over-representing a single type of data in the dataset will naturally bias the system towards the patterns that data contains, but the same property makes these applications easy to tune to specific needs. Three of the most important applications of machine learning are image processing, voice recognition and text interpretation.

These are among the areas we have worked in.
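To illustrate the point about one type of data dominating a dataset, here is a minimal Python sketch that measures how the labels are distributed before training. The function names, the 50% threshold and the toy labels are all assumptions made for this example; they are not taken from our own tooling.

```python
from collections import Counter


def label_shares(labels):
    """Return each label's share of the dataset as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def overrepresented(labels, threshold=0.5):
    """List the labels whose share exceeds the threshold (a hypothetical cut-off)."""
    shares = label_shares(labels)
    return sorted(label for label, share in shares.items() if share > threshold)


# Invented toy data: 8 shoe images, 1 bag and 1 hat.
sample = ["shoe"] * 8 + ["bag", "hat"]
shares = label_shares(sample)
flagged = overrepresented(sample)
print(shares)
print(flagged)
```

A check like this only shows where the imbalance is. Whether to correct it or keep it on purpose depends on the task the system is being tuned for.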

Image processing (object identification)

The most important requirement for a machine in image processing is that, much like human vision and thinking, it can interpret the images made available to it and recognize the various objects in them. This process is also called labelling, and it is one of the most widely applicable areas of artificial intelligence.

We have created our own image processing application based on a crowdsourced dataset, which we chose because it is easy to access. The dataset contains nearly 9 million images, annotated by volunteers with approximately 15 million labels. As a result, we developed a program that can categorize the objects and living creatures in any uploaded image or video and draw bounding boxes around them.

A similar solution powers an AI-based invention by the Chinese company Baidu, which aims to help the visually impaired. The device looks like a Bluetooth headset equipped with a camera. In addition to describing the scene in view, it can highlight important elements for the user. For example, it recognizes street lights and product labels, tells you what is in the fridge, and even indicates when a familiar face is approaching.

AI-based image recognition can easily transform e-commerce as well! Imagine walking down the street and seeing a product you like, so you quickly take a picture of it. You upload this image to a sophisticated system with an enormous database, and it shows you the product and its details, or the item that most closely resembles it. This saves you a lengthy search. To realize this practical vision, we have created a prototype that lets you search for footwear using pictures. If there is an exact match, the program shows that product; otherwise it shows the most similar footwear in the dataset.
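The "exact match, otherwise the most similar item" behaviour can be sketched as a nearest-neighbour search over image feature vectors. The Python example below is an illustration only, not the prototype's actual code: it assumes each image has already been converted into a numeric vector by some image model, and the three-number vectors and product names are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, catalog):
    """Return (product name, similarity) for the catalog item closest to the query."""
    best = max(catalog, key=lambda name: cosine_similarity(query, catalog[name]))
    return best, cosine_similarity(query, catalog[best])


# Hypothetical feature vectors; a real system would use far longer ones.
catalog = {
    "running shoe": [0.9, 0.1, 0.0],
    "leather boot": [0.1, 0.9, 0.2],
    "sandal": [0.0, 0.2, 0.9],
}

photo = [0.8, 0.2, 0.1]  # invented vector for the uploaded street photo
name, score = most_similar(photo, catalog)
print(name, round(score, 3))
```

An exact match shows up as a similarity of (almost) 1.0. Anything lower is the closest available alternative.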

This type of image search is also very useful when you need to find out which documents in a huge storage system contain a particular image. With the solution we are currently developing, anyone who has the image will be able to identify the documents it appears in. Other areas of application include the automatic identification of defective products on a production line and traffic counting by vehicle type.

Could you make good use of artificial intelligence in your own company? Learn more about our AI solutions!

Voice and speech recognition (signal processing)

Voice and speech recognition allows machines to interpret human speech and sounds from other sources. This lets us operate our smart devices more quickly and comfortably, and gives those devices the ability to recognize sounds.

Voice recognition-based solutions can already be found in many places. They let us communicate with Alexa, Siri and similar assistants, and they also work for us in the background when we listen to Spotify. Algorithms analyse a large number of songs and, by evaluating our activity (listening, liking, skipping, switching), categorize our musical taste and compile a weekly Discover Weekly playlist from songs with similar metadata.

YouTube also uses speech recognition algorithms to generate subtitles. This function works very well in English, but it is not available in Hungarian. To address this gap, we have created our own subtitling system!

More than 500 GB of audio and subtitle files were used as the training dataset. The files had to be cleaned, their timing had to be corrected, and unusable data had to be filtered out. After a long training process, the program became capable of continuously subtitling new audio material. The remaining problems are rarely occurring words that the system does not yet know and recordings with a high level of noise, which we consider quite a good result given the resources we had available. This type of artificial intelligence can be a great help for anyone who needs to search for terms in video or audio files.
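The filtering step described above can be pictured with a small Python sketch. It is a simplified illustration under our own assumptions, not the pipeline we actually ran: the `Cue` structure, the rules and the 10-second limit are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the beginning of the recording
    end: float
    text: str


def clean_cues(cues, max_duration=10.0):
    """Keep only cues that have text and a plausible duration (assumed rules)."""
    cleaned = []
    for cue in cues:
        text = " ".join(cue.text.split())  # collapse stray whitespace
        duration = cue.end - cue.start
        if not text:
            continue  # nothing to learn from an empty subtitle
        if duration <= 0 or duration > max_duration:
            continue  # the timing is clearly wrong
        cleaned.append(Cue(cue.start, cue.end, text))
    return cleaned


raw = [
    Cue(0.0, 2.5, "  Jó   napot! "),
    Cue(3.0, 3.0, "zero length"),
    Cue(4.0, 60.0, "far too long"),
    Cue(5.0, 6.0, "   "),
]
usable = clean_cues(raw)
print(usable)
```

Real subtitle files need further checks, for example whether the text really matches the audio, which a rule-based filter like this cannot decide.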

This type of speech recognition covers, in the best case, the full vocabulary of a language. There are also systems that only need to interpret a limited set of instructions or questions. Developing such programs is an easier task, and recognizing a fixed set of commands results in a much lower error rate.
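Why a small command set is easier can be shown with a short Python sketch. It assumes that speech has already been turned into text by some recognizer and only covers the last step: mapping a possibly imperfect transcript to one of a few known commands. The command list and the 0.6 cut-off are invented for this example; `difflib.get_close_matches` comes from Python's standard library.

```python
import difflib

# Hypothetical command set for a voice-controlled device.
COMMANDS = ["lights on", "lights off", "play music", "stop music"]


def match_command(transcript, commands=COMMANDS, cutoff=0.6):
    """Return the known command closest to the transcript, or None if none is close."""
    normalized = transcript.lower().strip()
    matches = difflib.get_close_matches(normalized, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(match_command("Play musik"))  # a slightly misrecognized command
print(match_command("what is the weather like tomorrow"))  # not a command
```

Because the system only has to choose among a handful of options, small recognition errors can be absorbed, while anything far from every command is rejected.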

Text interpretation

Text interpretation essentially means retrieving information from large amounts of text-based data. Natural language processing (NLP), which enables the machine to understand and process everyday language, is essential to this. Without it, computers might be able to look up the meaning of each individual word, but it is NLP that helps machines interpret words in context.


Hungarian is far from simple and is hard to model as a language, so teaching these interpretation processes to machines is not an easy task. In light of this, we are proud to have created our own search engine, Seekra, which we are continuously improving. Seekra handles the relationships within the Hungarian language in a unique way. With advanced semantic knowledge, automatic completion and synonym recognition, we have overcome a great number of language barriers.


Make searching a unique experience on your site with the Seekra intelligent search engine, optimized for the Hungarian language.


A widely used application of AI-based text interpretation is sentiment analysis. Here an algorithm analyses a given text and determines the emotional state of its author. The simplest form is classification into positive and negative groups, while a more advanced system can discern much finer differences in emotional state. Possible applications include analysing client reactions, feedback, comments and social media posts, and it can even help with automatic literary translation. In addition, a properly trained system can be used to make customer service more efficient. By analysing conversations, you can get an overall picture of your staff's performance and identify the words and phrases that lead to customer satisfaction. Automated customer service is just one step away from here!
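The simplest positive/negative classification mentioned above can be sketched in a few lines of Python. This is a deliberately naive word-list approach for illustration, not a trained model: the word lists are invented, and the extra "neutral" outcome for ties is our own assumption.

```python
import re

# Tiny hypothetical word lists; real systems learn these signals from data.
POSITIVE = {"great", "good", "helpful", "fast", "love", "excellent"}
NEGATIVE = {"bad", "slow", "rude", "broken", "hate", "terrible"}


def sentiment(text):
    """Classify text by counting positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("The support was fast and helpful"))
print(sentiment("Terrible service, the agent was rude"))
```

A word count like this ignores context entirely ("not good" would count as positive), which is exactly the gap that more advanced, trained systems close.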

With NLP you can create workplace assistants that turn enormous amounts of content into text and data in an easy-to-understand form. This is of paramount importance in business, as it makes objective, data-based decision-making easier to achieve.

The uses listed here are only a small part of what artificial intelligence can offer, and we have not even touched on combining the three highlighted areas.

If our article has given you any ideas that could be useful for your business, feel free to contact us and let’s work together on implementing them!

Try our AI-based speech recognition application for free!

Speed up your work with artificial intelligence! With the help of Alrite, you can easily create Hungarian transcriptions and video captions for dictated or previously recorded audio and video materials. The application lets you store files, edit and share transcriptions and captions, and run advanced searches.
