The Semantic Router: AI's Pathway to Understanding User Input



Here at the synvert Observation Deck team we are continually innovating and enhancing our chatbot to provide our customers with the best possible experience. One of the key challenges we face is accurately understanding and interpreting user requests, so that our responses are both relevant and helpful.

To tackle this, we are excited to introduce a new feature: the semantic router. This software module enables our chatbot to identify exactly what you, the user, are trying to achieve. By leveraging modern similarity search algorithms, the semantic router analyses your queries, comprehends their context and intent, and directs each request to the appropriate function. The result is a more accurate and efficient service, making your interactions with the chatbot smoother and more intuitive.

Beyond enhancing accuracy and relevance, the primary goal of the semantic router is to remove the need for a prompt or an AI agent to manually select which downstream task to perform. We initially approached this problem using prompts, but ran into issues with latency and rising costs.

The semantic router overcomes these problems by automating the task selection process, delivering faster responses at lower operational costs. This not only optimises performance but also streamlines the experience, making interactions more efficient as well as more cost-effective.

The Power of AI Embeddings
Figure 1: The Power of AI Embeddings

Classifying user input accurately within its specific context is far from simple. Traditional routing methods typically rely on conditional branching to choose a predefined path, which can be limiting in scope.

While this approach works well when the route is known in advance and can be explicitly programmed, the dynamic and complex landscape of today's AI demands more sophisticated techniques to process vast and varied real-world data.

Modern AI systems transform this data into mathematical representations that can be analysed to identify key patterns and features. These representations, known as embeddings, convert textual and other data types into dense, multidimensional vectors. This allows for a richer and more nuanced understanding of the data, enhancing further processing and analysis. Through embeddings, AI systems gain the ability to interpret and classify inputs more accurately, paving the way for more responsive and intelligent applications.

As mentioned previously, our semantic router relies heavily on generating embeddings for the text that needs to be classified. To generate these embeddings, we use specialised models designed to process large volumes of text data. These models learn and encode the semantic relationships between words and phrases into numerical vectors, allowing the semantic router to categorise and route text effectively based on its underlying meaning.

Embedding models are trained through a process involving several key steps. Initially, a large corpus of text is tokenised into manageable units, such as words or subwords. The next step is learning from the context in which these tokens appear. Two common methods are the Continuous Bag of Words (CBOW) and Skip-Gram models: CBOW predicts a target word from its surrounding context words, whereas Skip-Gram works in reverse, predicting the context words from a target word.

Initially, each word is assigned a random vector, which the model refines iteratively by optimising a loss function that measures prediction accuracy. Negative sampling is often employed during this process, helping the model distinguish between actual context words and randomly selected ones. As the model trains, it captures semantic relationships by mapping words into a high-dimensional space, where words with similar meanings are represented by vectors that are close to each other. For example: vec("Spain") + vec("capital") is close to vec("Madrid"), just as vec("blue") + vec("planet") is close to vec("Earth"). Below we can see another example, where similar concepts, like "silver" and "gold", or "bull" and "mammal", are positioned close to each other:

Figure: example of words assigned to vectors, with pairs such as "citric" and "orange", "silver" and "gold", and "bull" and "mammal" positioned close together.
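This analogy arithmetic can be illustrated with toy vectors. The numbers below are invented purely for demonstration; real embedding models produce vectors with hundreds or thousands of dimensions:

```python
import numpy as np

# Toy 3-dimensional "embeddings", invented for illustration only.
vec = {
    "Spain":   np.array([0.9, 0.1, 0.0]),
    "capital": np.array([0.0, 0.8, 0.3]),
    "Madrid":  np.array([0.8, 0.9, 0.3]),
    "blue":    np.array([0.1, 0.0, 0.9]),
}

def cosine(a, b):
    """Cosine of the angle between vectors a and b."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# vec("Spain") + vec("capital") lands closer to vec("Madrid")
# than to an unrelated word such as "blue".
query = vec["Spain"] + vec["capital"]
print(cosine(query, vec["Madrid"]) > cosine(query, vec["blue"]))  # True
```

With well-trained embeddings the same comparison holds in the full high-dimensional space, which is what makes such analogy tests a popular sanity check.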

Some of the most prominent embedding models include Word2Vec and BERT (Bidirectional Encoder Representations from Transformers), both developed by Google, and GloVe (Global Vectors for Word Representation) from Stanford. Each of these models has a unique approach to capturing and representing semantic information.

Fortunately, we don't have to handle the complexities of training and maintaining these embedding models ourselves. Instead, we can harness services like Microsoft Azure OpenAI, which provide access to state-of-the-art pre-trained models. Using Azure OpenAI, we can query these embedding models directly with the text we want to convert.

This approach not only saves us the computational overhead and resources required for training large models, but also ensures that we benefit from the latest advancements in AI technology. These pre-trained models allow us to focus on applying the embeddings to enhance our semantic router's capabilities, making it easier to classify and process text with greater accuracy and efficiency.
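As a rough sketch of what querying such a service looks like with the openai Python SDK (version 1.x): the endpoint, key, API version, and deployment name below are placeholders, and `get_embedding` is a hypothetical helper, not our production code:

```python
# Hypothetical helper around an OpenAI-style embeddings client. With the
# openai SDK (>= 1.0) the client would be built roughly like this -- all
# values below are placeholders:
#
#   from openai import AzureOpenAI
#   client = AzureOpenAI(
#       azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
#       api_key="YOUR-API-KEY",
#       api_version="2024-02-01",
#   )

def get_embedding(client, text: str,
                  deployment: str = "text-embedding-3-small") -> list[float]:
    """Return the embedding vector for `text` as a list of floats."""
    response = client.embeddings.create(model=deployment, input=text)
    return response.data[0].embedding
```

Because the helper only depends on the client's `embeddings.create` method, it works unchanged against any OpenAI-compatible endpoint.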

The Router Index

Now that we have the embeddings for our target text, the next step is to compare them against a reference set, known as an index, which represents the potential categories or topics for classification.

An index is essentially a collection of pre-computed embeddings that correspond to a predefined set of topics or categories. To build this index, we first generate embeddings for a comprehensive set of texts covering the full range of topics for classification. These embeddings are then organised into a structured format, typically a large matrix. Each row of this matrix holds the embedding vector of one reference text, making the matrix's dimensions equal to the number of reference vectors by the dimension of each vector. Since all embeddings are generated by the same model, they share the same vector dimensions, ensuring consistency in the comparison process.

Figure: the index matrix, whose dimensions equal the number of reference vectors by the dimension of each vector.
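As a minimal sketch, the index can be nothing more than the reference embeddings stacked into a NumPy matrix; the topic labels and tiny 3-dimensional vectors here are illustrative stand-ins for real model output:

```python
import numpy as np

# Illustrative reference embeddings, one per topic; in practice these come
# from the same embedding model that will later embed user queries.
topics = ["production_volume", "error_explanation", "report_listing"]
reference_embeddings = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.0],
    [0.0, 0.2, 0.9],
]

# Stack into a (number of reference vectors x embedding dimension) matrix.
index = np.vstack(reference_embeddings)
print(index.shape)  # (3, 3)
```

Keeping `topics` as a parallel list means each row of the matrix can be traced back to the category it represents.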

Creating this index involves aggregating embeddings from various texts into the matrix, allowing us to efficiently compare new embeddings against it. However, to ensure accurate and meaningful matching, any new embedding vector used for comparison must be generated using the same embedding model or a compatible one. This consistency is crucial, as embeddings from different models or versions may not align properly in the vector space, leading to unreliable results.

Once the index has been built, we can proceed with the matching process: comparing the embedding of the user's input text with the embeddings stored in the index. By identifying the nearest embeddings in the index to the user input embedding, we can determine the most relevant category or topic for the input text.

Similarity Search: Unlocking the Power of Data Matching

Now that both our index and the user input embedding are ready, the next step is to build a similarity matrix: a mathematical tool used to quantify how similar different vectors are to one another. In our case, the vectors are embeddings, numerical representations of texts, and the similarity matrix measures how closely the user input embedding matches each of the embeddings in our index.

One of the most common methods for this is cosine similarity, which measures the cosine of the angle between two vectors in a multi-dimensional space. The cosine similarity of two vectors, A and B, is the dot product of A and B divided by the product of their magnitudes. This yields a value between -1 and 1, where 1 indicates the vectors point in the same direction, 0 means they are orthogonal (unrelated), and -1 means they are diametrically opposed.

To build the similarity matrix, we perform the following steps:

  1. Compute the norms (magnitudes) of the user input embedding and each embedding in the index.
  2. Calculate the dot product between the user input embedding and each embedding in the index.
  3. Divide each dot product by the product of the corresponding norms to get the cosine similarity scores.
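The three steps above can be sketched in a few lines of NumPy; the tiny 2-dimensional vectors are illustrative only:

```python
import numpy as np

def cosine_scores(index: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Cosine similarity between `query` and every row of `index`."""
    index_norms = np.linalg.norm(index, axis=1)   # step 1: norms of index rows
    query_norm = np.linalg.norm(query)            # step 1: norm of the query
    dots = index @ query                          # step 2: dot products
    return dots / (index_norms * query_norm)      # step 3: normalise

index = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.7, 0.7]])
query = np.array([0.6, 0.8])

scores = cosine_scores(index, query)
ranked = np.argsort(scores)[::-1]  # topic indices, most similar first
print(ranked)  # [2 1 0]
```

Sorting the scores in descending order, as in the last line, directly gives the ranking of topics described below.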

Since we are comparing the user input embedding with each embedding in the index, our similarity matrix is actually a vector. It contains one similarity score per topic in the index, each representing how similar the user input is to that topic.

With our similarity scores in place, we can sort them in descending order, identifying the topics with the highest similarity to the user input. The higher the score, the more relevant the topic is.

And that's it! With this fully functional semantic router, user inputs are routed to specific task-handling mechanisms, enabling the chatbot to better understand user requests and provide relevant, precise responses.

Figure: user inputs routed to specific task-handling mechanisms.

To further illustrate how the semantic router works, look at the picture above. We manually create specific utterances for each of the routes it manages. These utterances represent the different types of user requests we anticipate, and they are stored together in the index. This allows the semantic router to quickly compare incoming queries with the pre-defined utterances, ensuring that it can accurately match user intent with the correct function. By efficiently organising these utterances in the index, we significantly improve both the speed and precision of the chatbot's responses.

Enhancing Routing with Similarity Techniques

Now that we have a fully functional semantic router guiding actions based on user input, there are several key factors to bear in mind to ensure excellent service delivery.

Firstly, user input isn't always clear or precise. People often express their intentions using vague, incomplete, or colloquial language, which can make it difficult for the semantic router to accurately interpret their intent. For instance, abbreviations or slang may confuse the system, leading to inaccurate routing. To address this, it is essential to incorporate a robust fallback mechanism, capable of handling cases where the user's intention is unclear, and either guiding them to clarify their input or defaulting to a general action that can accommodate the ambiguity.

Secondly, semantic routing relies heavily on interpreting the meanings of words and phrases to determine the appropriate action, but different actions may share similar semantics, especially in specialised or focused applications. For example, in a customer service setting, the distinction between a request for a "refund" and a "return" can be subtle and context-specific. This semantic similarity can pose a challenge for the router, as it may not always be able to distinguish between such closely related actions. So, while designing the semantic router, it is crucial to implement mechanisms that can handle such nuances, possibly by incorporating additional contextual information or by prompting users for further clarification when similar semantic routes are detected.

Figure: a user asks the chatbot what oil they produced in 2020; the chatbot offers two possible options to choose from.

As shown in the image above, for ambiguous questions (where the user may be asking "What type of oil did I produce in 2020?" or "What was the total volume of oil produced in 2020?"), the chatbot offers a choice, prompting the user to clarify which action (or route) to take.

This happens because the scores for the two routes, obtained after the similarity search, were too close, so the semantic router is unable to determine what the user wants to do. To calculate the scores in the Observation Deck chatbot, we perform the following steps:

  • We use a similarity search function to find the closest matches between the user's input and the pre-defined utterances in the index.
  • For each potential route, we sum the similarity scores of the relevant matches and count how many matches align with that route.
  • We then normalise the score by factoring in both the total similarity score and the number of matches for each route.
  • Finally, we sort the routes in descending order based on their normalised scores, enabling the semantic router to identify the most relevant action.
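The bullet points above can be sketched as follows. The exact normalisation used in the Observation Deck chatbot isn't shown here, so this sketch simply uses the mean similarity per route, and the ambiguity margin is an invented parameter:

```python
from collections import defaultdict

def rank_routes(matches, ambiguity_margin=0.05):
    """Rank routes from (route, similarity) pairs; flag near-ties.

    `matches` holds the closest index hits for a user query, e.g. as
    returned by a similarity search over the utterance index.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for route, score in matches:          # sum scores and count matches
        totals[route] += score
        counts[route] += 1
    # Normalise (here: mean similarity per route) and sort descending.
    ranked = sorted(((totals[r] / counts[r], r) for r in totals), reverse=True)
    # If the two best routes score too closely, ask the user to clarify.
    ambiguous = len(ranked) > 1 and ranked[0][0] - ranked[1][0] < ambiguity_margin
    return ranked, ambiguous

matches = [("production_volume", 0.91), ("production_type", 0.89),
           ("production_volume", 0.85)]
ranked, ambiguous = rank_routes(matches)
print(ranked[0][1], ambiguous)  # production_type True
```

In the example, the two routes end up within the margin of each other, which is exactly the situation where the chatbot presents the user with a choice instead of guessing.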

This process ensures that our chatbot identifies the most appropriate response based on semantic similarity, balancing both the quality and quantity of matching utterances.

In conclusion, while a semantic router is a powerful tool for guiding actions based on user input, certain considerations must be addressed to ensure its optimal effectiveness: implementing robust fallback mechanisms, handling the challenges of semantic similarity, continually refining the semantic models, and training users in effective interaction.

The semantic router offers significant advantages in terms of speed and cost-effectiveness, but it does have limitations when it comes to scalability, particularly if vector stores are dynamic and require the generation of new utterances each time the store is created. By addressing these issues, we can enhance the reliability and accuracy of the semantic router, providing a faster, more efficient, and user-friendly service.

If you're looking to implement or optimise a semantic routing solution for your business, synvert can help. Our team of experts specialises in delivering advanced AI-driven systems tailored to your organisation's needs, so get in touch with us today to explore how we can help you.