Efficient and precise document search capabilities are not just practical and convenient in today’s business world – they are essential for data-dependent organisations.

We’re proud to announce that we’ve introduced Document Search in the Observation Deck 4.0 release, enabling users to interact with their data conversationally in natural language, as you can see below:

This innovative functionality integrates seamlessly with our platform’s existing capabilities, enhancing user interaction by making it more intuitive and efficient. Leveraging AI, the system comprehends queries in natural language, delivering relevant results with citations from documents and data previously selected and uploaded by the user. This enhancement represents a significant leap forward towards making information retrieval as effortless as conversing with an assistant, streamlining the process of uncovering critical data and insights hidden in vast stores of documents.

Powered by Azure AI Search

The backbone of our new Document Search feature is Azure AI Search, a cloud-based search service enabling developers to swiftly set up and deploy sophisticated search experiences across web, mobile, and Azure enterprise applications.

It integrates perfectly with other Azure services, offering a robust platform for developing AI-powered search solutions. Unlike traditional search methods, Azure AI Search uses artificial intelligence and machine learning to understand the context and significance of the data, allowing searches that focus more on the meaning behind the words than the words themselves.

The Anatomy of the AI Search Service

This search functionality requires three main elements: Azure AI Search, a data container storing the actual documents and data, and the app from which the user queries the data:

In addition to this architecture, we also decided to integrate a conversational LLM from OpenAI within the app functionality, further enhancing the effectiveness and naturalness of the conversation driven by Azure AI Search.
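
To illustrate how retrieved passages and a conversational model can work together, here is a minimal sketch of assembling a grounded prompt with numbered sources. It is not the Observation Deck implementation: the names `RetrievedChunk` and `build_grounded_prompt`, the prompt wording, and the sample documents are all assumptions made for illustration, and the call to the model itself is left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    """One search hit (hypothetical shape): source file name and matching text."""
    source: str
    text: str


def build_grounded_prompt(question: str, chunks: list[RetrievedChunk]) -> str:
    """Build a prompt asking the model to answer only from the numbered sources."""
    # Number each chunk so the model can cite it as [1], [2], ...
    sources = "\n".join(
        f"[{i}] ({chunk.source}) {chunk.text}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources by their number, for example [1].\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Invented sample hits, standing in for results returned by the search service.
    hits = [
        RetrievedChunk("policy.pdf", "Employees accrue 25 days of leave per year."),
        RetrievedChunk("faq.md", "Unused leave can be carried over until March."),
    ]
    print(build_grounded_prompt("How much annual leave do I get?", hits))
```

Numbering the sources in the prompt is one simple way to let the answer point back to the user's own documents, which is the idea behind the citations shown in the app.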

Search Service

Azure AI Search, formerly known as “Azure Cognitive Search”, is a powerful search engine that supports vector, full text, and hybrid searches. It features rich indexing with integrated data chunking and vectorisation, as well as a robust query syntax for precise information retrieval. Flawlessly integrated with Azure’s extensive infrastructure, Azure AI Search harnesses the platform’s scalability, security, and data management services. Optimised by machine learning and AI capabilities, it supports semantic ranking and integrates with other Azure services for data ingestion and AI enrichment, offering a comprehensive solution for reliable, fast information retrieval.

Within this service, an indexer connects to and retrieves data from the configured external data sources where users have stored their documents and data; it then processes and loads this data into an index. Indexes are where the data is stored and organised in a searchable format. Essentially, indexers automate the data ingestion process, while indexes serve as the structured databases that users query against to find relevant information.
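
The division of labour between an indexer and an index can be sketched with a toy in-memory version. This is only an illustration of the idea, not how Azure AI Search works internally: `ToyIndex`, `run_indexer`, and the dictionary standing in for the external data source are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict

PUNCTUATION = ".,;:!?"


class ToyIndex:
    """A tiny inverted index: maps each lowercase word to the documents containing it."""

    def __init__(self) -> None:
        self.postings: dict[str, set[str]] = defaultdict(set)

    def add(self, doc_id: str, text: str) -> None:
        """Store the document under every word it contains."""
        for word in text.lower().split():
            self.postings[word.strip(PUNCTUATION)].add(doc_id)

    def search(self, query: str) -> set[str]:
        """Return the ids of documents that contain every word of the query."""
        words = [w.strip(PUNCTUATION) for w in query.lower().split()]
        if not words:
            return set()
        matches = [self.postings.get(w, set()) for w in words]
        return set.intersection(*matches)


def run_indexer(data_source: dict[str, str], index: ToyIndex) -> int:
    """Pull every document from the data source and load it into the index."""
    for doc_id, text in data_source.items():
        index.add(doc_id, text)
    return len(data_source)


if __name__ == "__main__":
    index = ToyIndex()
    run_indexer({"a.txt": "Quarterly revenue grew.", "b.md": "Revenue fell in Q2."}, index)
    print(index.search("revenue grew"))
```

The indexer only moves and prepares data; every query goes to the index, which is why the two can be scheduled and scaled separately.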

We can configure indexers in detail to control the frequency of data updates, to define custom field mappings between the source data and the index structure, and to apply data transformations or enrichments during the indexing process through cognitive skills. We configured the skillset for our document search functionality to work with OpenAI embeddings, meaning that texts are chunked and vectorised using an embedding model from OpenAI, in this case Ada-2 (text-embedding-ada-002).
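
Chunking can be pictured as a sliding window over the text. The sketch below splits by words and lets neighbouring chunks overlap, so that a sentence cut at one boundary still appears whole in an adjacent chunk. The function name, the word-based splitting, and the default sizes are assumptions for illustration; the actual skillset settings are not given here, and the embedding call that would follow each chunk is omitted.

```python
from __future__ import annotations


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_text(sample, chunk_size=4, overlap=1):
        print(chunk)
```

Larger chunks keep more context in each vector, while smaller chunks make citations more precise; the right balance depends on the documents being indexed.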

Using an embedding model for document search offers a more nuanced understanding of the content beyond simple keyword matching. It translates documents and queries into vectors in a shared space, where the system can measure their semantic similarity. This approach enables the retrieval of documents that are contextually relevant to the query, even if they don’t contain the exact query terms, leading to more accurate and meaningful search results.
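
A common way to measure closeness in a vector space is cosine similarity. The sketch below uses hand-made three-dimensional vectors in place of real embeddings (which have far more dimensions); the document names and numbers are invented purely for illustration.

```python
from __future__ import annotations

import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: list[float], doc_vecs: dict[str, list[float]]
) -> list[tuple[str, float]]:
    """Score every document against the query and sort from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy vectors standing in for embeddings of two documents and one query.
    docs = {
        "holiday_policy": [0.9, 0.1, 0.0],
        "server_manual": [0.0, 0.2, 0.9],
    }
    query = [1.0, 0.2, 0.0]
    for doc_id, score in rank_by_similarity(query, docs):
        print(f"{doc_id}: {score:.3f}")
```

Because the score depends on the direction of the vectors rather than on shared words, a query and a document can match even when they use different vocabulary.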

The Data Container

The data container is a secure storage area where users can upload and manage their documents and data. This centralised repository keeps all necessary information readily accessible for indexing and searching, supports a variety of document formats (.pdf, .txt, .md, .csv), and maintains data integrity. As mentioned above, it is configured primarily to feed our indexer skillset, which optimises the data to enhance the functionality and value of our search service chatbot.

We can add multiple cognitive skills to our indexer to increase the value of the search. For example, by adding an OCR (Optical Character Recognition) cognitive skill, we can also search graphical documents. These skillsets and other exciting features are explained in detail at this link.

The App

The app, a user-friendly interface within Observation Deck, enables users to conduct searches, discuss results, and view additional information such as actual citations from their documents. It’s designed for ease of use, allowing non-technical users to effectively query their data using natural language, to customise their search experience, and to navigate swiftly through results to find the necessary information.

Further Optimisations

Searching an index with Azure AI Search involves two execution layers:

  • Retrieval – Layer 1 (L1) quickly retrieves documents using keyword search, vector search, or a hybrid method that combines both, and passes roughly the top 50 documents to the next layer.
  • Ranking – Layer 2 (L2) then refines these results using deep learning models for semantic ranking, so that the most relevant results appear at the top.
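
The two layers can be sketched as a cheap first pass followed by a more expensive second pass over the survivors only. This is a simplified illustration, not the service's internals: `keyword_overlap` stands in for L1 retrieval, and the `rerank` callable is a hypothetical placeholder for the deep learning model used in L2.

```python
from __future__ import annotations

from typing import Callable

# A scorer takes (query, document text) and returns a relevance score.
Scorer = Callable[[str, str], float]


def keyword_overlap(query: str, doc: str) -> float:
    """Cheap first-pass score: number of distinct query words found in the document."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def two_stage_search(
    query: str,
    docs: dict[str, str],
    rerank: Scorer,
    retrieve_top: int = 50,
    final_top: int = 3,
) -> list[str]:
    """Return document ids after a cheap retrieval pass and a costlier reranking pass."""
    # Layer 1: score every document cheaply and keep only the best candidates.
    candidates = sorted(
        docs, key=lambda doc_id: keyword_overlap(query, docs[doc_id]), reverse=True
    )[:retrieve_top]
    # Layer 2: apply the expensive scorer to the candidates only.
    reranked = sorted(
        candidates, key=lambda doc_id: rerank(query, docs[doc_id]), reverse=True
    )
    return reranked[:final_top]


if __name__ == "__main__":
    corpus = {
        "a": "apple banana",
        "b": "apple apple apple cherry",
        "c": "durian",
    }
    # Term frequency is used here only as a stand-in for a real semantic ranker.
    term_frequency = lambda q, d: float(d.lower().split().count(q.lower()))
    print(two_stage_search("apple", corpus, term_frequency, retrieve_top=2, final_top=2))
```

Limiting the second pass to a fixed number of candidates keeps the cost of the heavier model bounded regardless of how large the index grows.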

This Microsoft blog post details a performance study on different index searches, and concludes that the best method to get accurate results is hybrid retrieval and semantic ranking.

Hybrid retrieval combines traditional keyword and vector-based search to locate relevant documents efficiently. Semantic ranking then uses advanced language models to refine and prioritise these search results by relevance.

This method leverages the precision of keyword search to capture specific terms, whilst vector search aligns semantically with the query’s intent, even across languages. Tests across various customer and academic datasets confirm that hybrid retrieval with semantic ranking significantly outperforms other methods. The outcome is more accurate and valuable results for end users, and better generative AI performance, because the model is grounded in the most contextually appropriate content.
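
One widely used way to merge a keyword result list and a vector result list is Reciprocal Rank Fusion (RRF), which Azure AI Search documents for its hybrid queries. The sketch below shows the general technique rather than the service's exact implementation; the function name, the constant `k = 60` (a commonly used default), and the sample document ids are assumptions for illustration.

```python
from __future__ import annotations


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Merge several ranked lists: each document earns 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    keyword_results = ["a", "b", "c"]  # ranked by keyword match
    vector_results = ["b", "d", "a"]   # ranked by vector similarity
    for doc_id, score in reciprocal_rank_fusion([keyword_results, vector_results]):
        print(f"{doc_id}: {score:.4f}")
```

Documents that rank well in both lists rise to the top, while a document found by only one method still makes it into the merged list, which is what lets hybrid retrieval combine exact-term precision with semantic recall.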

Conclusions

Benefits of Azure AI Search

  • Semantic search: Semantic ranking, together with the other search settings, provides results closely aligned with the user’s search intent.
  • Comprehensive indexing: Indexes various document formats, making it easier to locate information across different data types.
  • Scalability: Built to scale, ensuring it can handle growing data volumes seamlessly.
  • Security and compliance: Azure’s robust security framework protects sensitive data while complying with global regulations.

How Document Search Changes the Game

Integrating Azure AI Search into Observation Deck transforms how businesses access and use their data:

  • Faster decision-making: Reduces the time spent searching for information, enabling quicker, better-informed decisions.
  • Discovery of insights: AI-driven search reveals valuable insights that might otherwise be missed, enabling a deeper understanding of business and market dynamics.
  • Customisable searches: Users can customise their data, search settings and AI model settings to ensure that results meet their specific information needs and business objectives.

If you want to see how Azure AI Search can help your organisation save time reading documents and extract the value of your data, or to understand how Observation Deck can bring executive insights to your fingertips, simply contact us and our team of experts will be happy to help you!