Perform Rigorous Background Verification Checks Before Lending or Onboarding with KDiscover

Karza Technologies’ KDiscover application enables banks and corporations to conduct rigorous background verification checks before onboarding customers and employees.

Published on 25 MAY 2022 | 3 mins read

In the past seven years, India lost at least Rs.100 crore daily to banking fraud. Consumers are subjected to a harrowing number of frauds every day. A majority of these frauds, such as identity theft, can be averted if banks conduct comprehensive background checks.



During an investigation of a Mumbai office of the Employee Provident Fund Organization, the Central Bureau of Investigation discovered that an employee had siphoned off Rs.18.97 crore from a common PF pool through consistent withdrawals.


An employee of a Delhi-based IT firm, who had been hired virtually, refused to attend weekly meetings while infection cases were at an all-time high and several waves swept through the city. He cited a number of novel excuses, including the fear of contracting the disease and his vehicle breaking down.

After noticing that his official email was being used to send large files to other multinationals, his juniors alerted the management. An investigation revealed that he had taken up a new multinational job without quitting the previous one. 


Frauds of this kind happen on a daily basis. The vast majority of them can be detected and avoided if candidates are subjected to a background check before starting work. As per reports collated by Indian publications, the share of embellished resumes has increased from 30% to about 55%.

One of the major reasons people misrepresent themselves is to land a job and become financially stable. In certain organizations, time is of the essence when it comes to filling positions, so thorough fraud mitigation is not performed. A glance at the applicant's academic credentials and at important documents such as PAN cards, voter ID cards, and Aadhaar cards is treated as sufficient.



What is KDiscover?

Karza Technologies' KDiscover application offers insights into customer profiles in three steps: it screens the information given in the application, recursively scans public sources for additional information, and independently authenticates and scores applicants so that the system is alerted to suspicious ones. It is a one-of-its-kind unified offering that risk-scores an application across all KYC-related verifications.

KDiscover provides a tangible, definite risk score that unambiguously represents the risk involved in onboarding a customer or employee. Not only is the complete employment history retrieved from a large number of sources, but we can also build related-party networks to gain insight into the employment history of parties related to the employee. For example, the employee's uncle may have been convicted of a financial crime while working for a different organization, which calls for an investigation into the employee's background. The related-party network also raises red flags and warning signals if any of the parties are involved in fraud or crime.
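To make the related-party idea concrete, here is a minimal sketch of how flagged parties near an applicant could be found with a breadth-first search over a relationship graph. It is an illustration under stated assumptions, not KDiscover's actual implementation: the function name `flagged_related_parties`, the `max_hops` limit, and the sample graph and flags are all hypothetical.

```python
from collections import deque


def flagged_related_parties(graph, flags, applicant, max_hops=2):
    """Return {party: hops} for flagged parties within max_hops of the applicant.

    graph: dict mapping each party to a list of directly related parties.
    flags: dict mapping a party to the reason it is flagged.
    Both structures are hypothetical stand-ins for a real related-party network.
    """
    seen = {applicant}
    queue = deque([(applicant, 0)])
    hits = {}
    while queue:
        party, hops = queue.popleft()
        if party != applicant and party in flags:
            hits[party] = hops
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(party, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return hits


# Hypothetical network mirroring the uncle example in the text.
graph = {
    "applicant": ["uncle", "former_employer"],
    "uncle": ["applicant", "other_org"],
    "former_employer": ["applicant"],
    "other_org": ["uncle"],
}
flags = {"uncle": "convicted of a financial crime"}

hits = flagged_related_parties(graph, flags, "applicant")
for party, hops in hits.items():
    print(f"{party} ({hops} hop(s) away): {flags[party]}")
```

Limiting the search by hop count is one simple way to keep distant, weakly related parties from triggering an investigation; a real system would likely also weigh the type of relationship.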

Finally, KDiscover runs automated risk and fraud checks when customers apply for banking products such as loans and credit cards, as well as when employees are onboarded and integrated into the company.

Use Cases of KDiscover 

Employment Verification Advanced 

During the onboarding process, the employer ought to verify the applicant's detailed employment history and whether the applicant has the relevant exposure, work experience, and educational background to perform the range of duties characteristic of the position. The employer should also check whether the applicant is really the person he claims to be. In other words, the checks should comprehensively establish the truthfulness of the information the applicant provided during the hiring process.

Risk Identification 

During customer onboarding, it is important that banks perform an exhaustive background check to determine the creditworthiness of the applicant and to verify the detailed information the applicant provides when registering as a customer or applying for banking products such as loans and credit cards. Background checks are imperative to protect the bank's finances and reputation.



How Does KDiscover Work?


KDiscover is based on the combination of three elements: graph technology, AI, and analytics. Artificial intelligence is used for a variety of tasks, such as name matching, address matching, and face matching; essentially, all the checks and matches required when an employee or a customer is onboarded digitally. Interlinking several databases with graph technology allows us to verify the important constituents of identity, address, and employment.


In essence, we don't just rely on the submitted information to calculate a score, but instead, enhance and augment that information by recursively searching various public sources for supplementary information. On top of that, we apply AI to crunch all of the data and give each applicant a score based on the merit of the data. 


Let us delve further into the layer of math or AI applied to the given and sourced data to score the applicant. Take name matching: suppose the name on the PAN card matches the submitted name with a name match score of 95%, while the name on the Voter ID yields a score of 85%. The pressing question is how to arrive at a single consolidated match score that incorporates the name match scores obtained from the different documents.


We give varying weights to different sources depending on what has been retrieved through search and what was originally provided. For example, consider addresses. Suppose you provide two documents, your PAN card and your Voter ID card. If the submitted address does not match the PAN record but does match the Voter ID, the result is still a match, because you did not submit your PAN details as address proof; you submitted your Voter ID card as address proof. Hence, different weights are given to the PAN card and the Voter ID card. If your address does not match the address on the Voter ID card, however, that is definitely a serious problem.
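One simple way to combine per-document scores with source weights is a weighted average. The sketch below applies it to the 95% (PAN) and 85% (Voter ID) name match scores mentioned earlier. It is only an illustration: the function name `consolidated_match_score` and the weights 0.6 and 0.4 are assumptions made for this example, not values used by KDiscover.

```python
def consolidated_match_score(scores, weights):
    """Weighted average of per-source match scores, each on a 0-100 scale.

    scores:  dict mapping a source name to its match score.
    weights: dict mapping a source name to its (hypothetical) weight.
    """
    total_weight = sum(weights[source] for source in scores)
    if total_weight == 0:
        raise ValueError("at least one source must carry a non-zero weight")
    weighted_sum = sum(scores[source] * weights[source] for source in scores)
    return weighted_sum / total_weight


# Name match scores from the example in the text.
name_scores = {"pan": 95.0, "voter_id": 85.0}
# Hypothetical weights; a real engine would set these per attribute and source.
name_weights = {"pan": 0.6, "voter_id": 0.4}

score = consolidated_match_score(name_scores, name_weights)
print(f"Consolidated name match score: {score:.1f}%")
```

Dividing by the total weight keeps the result on the same 0-100 scale even when a document is missing or the weights do not sum to one.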


Similarly, suppose a power or gas connection bill is retrieved from the mobile number by recursively scanning publicly available sources. If the address on that bill does not match the address submitted to the bank, the penalty is lower, because the address registered with, say, Bharat Gas is not the applicant's address proof. Covering all of these scenarios makes the math convoluted. We have developed a rule-based engine covering the different scenarios, with a layer of math surrounding it. This is how we arrive at a consolidated score that represents the actual risk involved in onboarding a new customer or employee.
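The address scenarios above can be sketched as a tiny rule table in which the penalty for a mismatch depends on the role the source plays. This is a hedged illustration only: the role names, the penalty values, and the `AddressCheck` type are hypothetical, chosen so that an address-proof mismatch is serious, a mismatch on a discovered source costs less, and a mismatch on a document not submitted as address proof costs nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressCheck:
    source: str    # e.g. "voter_id", "pan", "gas_bill"
    role: str      # "address_proof", "other_submitted", or "discovered"
    matches: bool  # whether the source address matches the submitted one


# Hypothetical penalties per role, applied only when the address does not match.
PENALTY_BY_ROLE = {
    "address_proof": 60,    # mismatch on the submitted address proof is serious
    "discovered": 10,       # e.g. a utility bill found via the mobile number
    "other_submitted": 0,   # e.g. PAN, which was not given as address proof
}


def address_risk_penalty(checks):
    """Sum the role-based penalties over all mismatching sources."""
    return sum(PENALTY_BY_ROLE[check.role] for check in checks if not check.matches)


checks = [
    AddressCheck("pan", "other_submitted", matches=False),
    AddressCheck("voter_id", "address_proof", matches=True),
    AddressCheck("gas_bill", "discovered", matches=False),
]
penalty = address_risk_penalty(checks)
print(f"Address penalty: {penalty}")
```

Keeping the rules in a table makes it easy to add scenarios without touching the scoring logic, which is one plausible reason to pair a rule-based engine with a separate layer of math.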

What Challenges Did Karza Face While Developing KDiscover, and How Did We Overcome Them?


During the design and implementation of KDiscover, we faced many challenges. Here are some of them and how we dealt with them.


Disparate and Unstructured datasets


The data in publicly available sources is disorganized, scattered, disparate, and unstructured. Consequently, we need to clean it up, removing unwanted debris and making it legible, usable, structured, and organized, so that it can serve as a relevant dataset for obtaining valuable insights. In effect, we extract nuggets of information from a hotchpotch of scattered, intertwined data. This is the biggest challenge.


Abysmal Data Quality


When you look at voter IDs and driving licenses, the photo quality is terrible; the pictures sometimes look like relics from the holder's childhood. Photographs are usually of abysmally low resolution, making it hard to decipher faces. Other types of noise include blurring, tampering, and cuts here and there. The same goes for names and addresses.

Since we do not own the data, we are unlikely to have access to historical data. For the final scoring engine, this can mean there is not enough data to arrive at a particular score, which derails the entire process. We had to start with unsupervised techniques, then work painstakingly with banks to assess performance and accuracy, examine boundary and corner cases, create signals, and train models iteratively. We did not have the luxury of ready-made training datasets for any of the problems we were tackling.

For most of these problems, we figured out a way to source training data from publicly available databases. For some, the process was semi-automatic: we mined a few things and devised a rule-based engine to approximate training datasets.


In some cases we built proprietary datasets from the ground up, after investing a great deal of time with clients to understand their needs, and thereby the needs of the industry, and to capture some important signals. Signals indicate whether the model is good or bad. We use them to approximate the training strategy for our models.

Scale and Overall impact of the solution

Here is how KDiscover can be scaled without an iota of change to the basic IT infrastructure:

Consider an example. One of the banks started using KDiscover only on the sample cases referred to its risk containment unit (RCU). For instance, out of every 100 personal loans or 100 credit cards being issued, the bank would cull a sample of 10 and send it to the RCU. Typically, the RCU would send field agents ("feet on the street") to verify all of the data points. Once the bank started using KDiscover, the savings in both money and time were large enough to compel it to extend KDiscover to 100% of applications, rather than only the random samples that initially represented a fraction of the total cases.


Thus, from a scalability perspective, the bank was able to scale to 100% of applications without an iota of impact on its basic IT infrastructure. For the bank, it was simply a redirection. It could tangibly see that this utility was improving its turnaround time (TAT) and costs, as well as increasing the amount of risk it could absorb. Applying KDiscover to every application, rather than to a random sample, meant that the entire set of applications could be verified and processed digitally and immediately.


The bank would rely on its RCU to confirm that the checks performed digitally matched the physical evidence as well. For employment verification, for example, the legacy process was to send an email to the applicant's official email ID and wait for the applicant to click a link or enter an OTP. One problem was delivery failure, because the employer's mail servers would not let an email containing a link through. Another was that the borrower could not access the online loan application system from the employer's office because of stringent IT restrictions.


Applicants therefore might not be able to see the OTP sent to their mobile number or email address while applying for a loan. There were innumerable glitches in the legacy processes. With KDiscover, the bank could clearly see that it could verify any applicant's employment without sending any link whatsoever. That is where the impact came from, and hence the bank extended it to 100% of cases.



How Did Two of The Top 10 Banks in India Leverage KDiscover?

One bank used to take around 15-20 minutes to retrieve and analyze a single applicant's background verification. Given the enormous number of applications, the process took an extraordinarily long time and suffered from the egregious errors typical of manual verification. Staff would open various websites, access a variety of sources, and check for name mismatches across them. For employment, they would check whether the applicant's name appeared on the EPF website to determine whether he had actually been employed previously, and cross-verify the information he provided.

They also checked whether the official email address was active. Even after following the entire process, they arrived at only an ambiguous idea of a person's genuineness, which left a great deal to be desired. Because of the sheer number of checks to be performed, and the monotony and fatigue of the process, some red flags were overlooked, and fraudsters with malicious intent were onboarded.

With KDiscover, the entire process is cut down to 30-40 seconds, with the full list of checks performed across the board. Further, instead of the gross oversimplification of sorting cases into just two categories, accept or reject, there are three: accept, reject, and an amber category in which we state that we cannot validate the provided information and therefore need additional data points to reach a clear decision. While accept and reject cases facilitate straight-through processing for clients, amber cases require manual intervention.
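The three-way outcome can be sketched as a small triage function. This is an assumption-laden illustration rather than KDiscover's actual decision logic: the 0-100 risk scale, the thresholds `accept_below` and `reject_above`, and the choice to route mid-range scores to amber are all hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    AMBER = "amber"


def triage(risk_score, unvalidated_fields, accept_below=30.0, reject_above=70.0):
    """Map a risk score (assumed 0-100, higher is riskier) to a decision.

    Any field that could not be validated sends the case to amber, since a
    clear accept or reject needs additional data points. Thresholds are
    hypothetical.
    """
    if unvalidated_fields:
        return Decision.AMBER
    if risk_score < accept_below:
        return Decision.ACCEPT
    if risk_score > reject_above:
        return Decision.REJECT
    return Decision.AMBER  # assumed: inconclusive scores also go to manual review


decisions = [
    triage(12.0, []),              # low risk, everything validated
    triage(88.0, []),              # high risk, everything validated
    triage(12.0, ["employment"]),  # employment could not be validated
]
for decision in decisions:
    print(decision.value)
```

Only the amber branch needs a human reviewer, which is what allows the accept and reject cases to flow straight through.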

In a bank, white-collar jobs occupy the higher echelons of management, but there are also a great many clerical jobs with a very high attrition rate. The inflow and outflow of workers is enormous.

One of the top banks in India uses KDiscover to screen such applicants, to ensure they are genuine and that, in cases of previous employment, the applicant was actually employed by the company he claims to have worked for. Here, more weightage is given to employment verification as described above, while some attention is given to identity and address as well.


With Karza Technologies' KDiscover, loan processing and employee onboarding, which traditionally took hours or days, are shortened to minutes or seconds. More clients and employees can be onboarded, and the red flags that would otherwise be overlooked or ignored, because of the high attrition rate and the employer's need to fill positions as quickly as possible, are raised for scrutiny or manual intervention. Failing to pay attention to such red flags can result in huge losses that cannot be recovered.

Karza Technologies is acquired by Perfios Software Solutions