Data analysis vs data categorization: what’s the difference?

A common understanding of ‘data’ is essential before joining the data analysis vs data categorization discussion to understand their differences.

Data is defined as “factual information (such as measurements or statistics) used as a basis for reasoning, discussion, or calculation,” by the Merriam-Webster dictionary.

Data is also understood as pieces of information or content stored in a particular manner for a specific purpose. Any piece of information that can be placed in a context and put to some future use can be considered data. Numbers are data, as are characters of the alphabet arranged in sequences. Images of enemy territory are data, as is a video of an engineer following a sequence on the shopfloor.

Though data is also defined as ‘units of information,’ the terms ‘data’ and ‘information’ are often used interchangeably. However, there are many who use them to mean different things. In general, data is just data: a mass of facts that may not have much meaning or use for anyone. When it is placed in a context or reviewed with a view to application, it becomes information that is of use to the owner of that data.


The oWorkers advantage

With its talented pool of resources who have a deep understanding of data, oWorkers retains the ability to go beyond the data analysis vs data categorization discussion and delve deeper into all manner of work that requires working with data. We have been recognized as one of the three best data services providers in the world on multiple occasions.

The rich talent pool oWorkers has access to is a result of our deep engagement with the communities we work in. This includes a steady stream of walk-in applicants interested in a job with oWorkers, which gives us a choice of talent for all our projects. Whatever the aspect of data they need to work on, our training team is equipped to polish them to deliver in the target process.

A related benefit of access to a continuous talent pool is the ability to provide for short-term ramps in client volumes arising out of planned, expected or unexpected events. Our deep supply pool enables us to meet these short-term requirements. This is a huge cost saving for clients who would normally need to hire resources on a long-term basis despite the work requirement being only for a few days or weeks in the year.


Data analysis vs data categorization: what they mean

Let us look at what data analysis and data categorization mean.

What is Data Analysis?

Data analysis can be understood as the process of making sense of available information in order to understand both the data and the underlying variables that produced it, and then applying what is learned for the benefit of the company or individual doing the analysis.

Sounds complex, doesn’t it?

When something is put across as a formal statement, it might look daunting at first. However, if you think for a moment, data analysis is a natural process that we do all the time in our personal lives.

When you go to a doctor with an ailment, what does she do? She will look at and understand the symptoms, perhaps ask some questions, correlate the information with your past medical history as well as ailments that might be common at that time of the year and prescribe a cure. What she is doing is data analysis.

When a youngster is exploring courses and universities so that he can make up his mind on which ones to apply to, what is he doing? Isn’t he doing analysis of data? He will look for information on which universities are offering a course of his interest. He will check out their intake criteria and if he will be eligible. He will probably also check the financial requirements and establish which ones he will be in a position to afford. He is analysing the many different pieces of data that will facilitate his decision.

When I am playing tennis, I look at my opponent’s position, try to project his likely movement, and then play a shot with the intention that it is either a winning shot or is a build-up to the winning shot. What did I just do? I analyzed data.

Of course, all these elements of data analysis are perhaps done subliminally, without being called data analysis.

The same process becomes more formal when it is done in a formal setting, like that of a company, and is called data analysis.

What is data categorization?

In the modern enterprise, data is critical. This is not to suggest that data was not critical in the pre-modern enterprise, but with the growth of population and consumer franchises of global corporations, the generation and collection of data has assumed mammoth proportions. Besides, as the world has become increasingly competitive, with democratic, free-market societies becoming the norm, corporations would like to leverage every bit of data at their command to eke out an advantage over their rivals and assume dominance in the marketplace.

Generation of huge volumes of data creates a need for storing it in a manner that allows it to be accessed by the people and teams who need it and are authorized to do so. Categorization of data becomes essential for its future application and usability. Categorization enables data to be stored in a manner where items needed for a particular requirement can be identified and, hence, retrieved. An organization could drown in the mass of data it has generated if, for every single requirement, it has to go through all the data it has collected.

Data categorization could be defined as the process of collecting, sorting and storing data in a manner that will enable easy retrieval when needed as well as access for retrieval, editing and deleting only to a defined set of personnel, or positions, based on the policy of the company.

The oWorkers advantage

By employing the staff needed for its projects, oWorkers creates permanence in project delivery, as opposed to some competitors who choose to rely on freelancers or contract staff. In this context, data analysis vs data categorization ceases to be relevant as we can handle either with equal aplomb, the result of two-way trust built between the employer and staff. The staff trust the employer to monitor and manage career progression while the employer expects staff members to deliver their best on client contracts.


Data analysis vs data categorization: their purpose

Data analysis

Data analysis is a key input process for business leaders. They expect the collective wisdom of past experiences to be distilled out and used as the bedrock for future decision-making and direction.

It can provide key insights about customer behavior, the reason for the existence of the business. The business not only gets information on buying behavior, but also a wholesome view including what the customers are saying about you.

It serves to measure the efficacy of initiatives like marketing and promotional programs, showing which ones are working and which ones are not, so that decisions can be taken on the fly.

It can even serve as a barometer for internal evaluations based on key metrics of the business, either for teams and departments, or for individuals.

In short, data analysis serves as a key input for managerial decision-making.

Data categorization

The main aim of data categorization is arranging data for easy access by authorized users. Once categorized, the stored data becomes easy to understand and its utility improves manifold. It also serves as an input for data analysis, as it would be impossible to analyse raw data that is unstructured and undifferentiated. The exercise also serves as a validation of the data by ensuring that each individual piece fits into one of the expected categories, based on its characteristics.

Data categorization also serves the purpose of regulatory compliance. Many jurisdictions have laws pertaining to storage, searchability and retrievability of data.

With several unicorn marketplaces as longtime clients, oWorkers understands the challenges of this work and is equipped to handle them. With centers in three of the most sought-after delivery locations in the world, oWorkers employs a multicultural team which enables it to offer services in 22 languages.

Our leadership team comes with hands-on experience of over 20 years in the industry. They lead the company on its various projects and ensure client requirements are fulfilled. Through an Internal Quality (IQ) team that serves as their eyes and ears, they stay abreast of developments on the shop floor and are able to intervene when the requirement arises.

The IQ team also leads improvement initiatives and keeps a check on output to ensure the client does not receive sub-par quality. They monitor transactions and provide feedback and inputs to the operating units.


Data analysis vs data categorization: how they are done

Data analysis

It is difficult to put a boundary around how data can be analysed. There are many different ways of looking at data analysis methods. At the highest level, one school of thought is to break data down into qualitative and quantitative sets.

Qualitative data, as the name suggests, is data that is anecdotal, like visitors at an exhibition showing interest in one product and not the other, or textual, like comments on a feedback form, or verbal, which is like an unwritten version of textual data. As it cannot be numerically measured, there is some amount of subjectivity in drawing conclusions from it. Though some people are chary of handling unstructured data, as conclusions can be questioned, it is an important source of information for decision-making.

Quantitative data, on the other hand, is often numbers, or at least surrogate numbers that can be processed through standard mathematical or statistical techniques. While average and deviation might be the most common, a host of other techniques like Conjoint Analysis, Cluster Analysis, Regression, Factor, Time Series and Cohort Analysis and many others come into play.
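As a minimal sketch of the two most common measures mentioned above, the snippet below computes the average and the standard deviation of a small set of numbers using Python's standard library. The weekly order counts are made-up values used only for illustration.

```python
import statistics

# Hypothetical weekly order counts, invented purely for illustration.
weekly_orders = [120, 135, 128, 150, 142]

average = statistics.mean(weekly_orders)     # central tendency
deviation = statistics.stdev(weekly_orders)  # sample standard deviation

print(f"average: {average}")
print(f"standard deviation: {deviation:.2f}")
```

The more advanced techniques listed above usually call for specialised statistical libraries, but they start from the same kind of numeric input.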

Of course, it is not a simple exercise of finding a tool and passing data through it. The analysis is contextual and needs to be done keeping in focus the objective of the organization as well as what it expects from that analysis.

Data categorization

Categorization of data can be done on a variety of parameters.

Traditional data categorization systems were driven more by the need for securing data based on its sensitivity. Data tended to be placed into categories such as ‘Restricted,’ ‘Confidential,’ ‘Classified,’ ‘Public,’ ‘Sensitive’ or ‘Private.’

Thought processes have evolved. The preference now is towards multidimensional tagging that can categorize a particular piece of data on multiple parameters at the same time. Of course, data storage in digital formats also facilitates multi-dimensional categorization that could be affixed as tags. Once this is done, the level of availability of data and access rights can be determined.

Some common types used by complex organizations:

  • Based on value
  • Based on usefulness timeframe
  • Based on information type
  • Based on who it pertains to – clients, employees, etc.
  • Based on requirements to refresh
  • Based on retrieval rights
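
The multidimensional tagging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DataItem` class, the `find_items` helper, the file names and the tag values are all hypothetical and do not describe any particular product or the system used by any specific organization.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    """One stored item carrying a tag for each categorization dimension."""
    name: str
    tags: dict[str, str] = field(default_factory=dict)


def find_items(items, **criteria):
    """Return the items whose tags match every dimension=value pair given."""
    return [
        item for item in items
        if all(item.tags.get(dim) == value for dim, value in criteria.items())
    ]


# Hypothetical catalogue; names and tag values are invented for illustration.
catalogue = [
    DataItem("payroll_2023.csv", {"pertains_to": "employees",
                                  "info_type": "financial",
                                  "sensitivity": "confidential"}),
    DataItem("price_list.pdf", {"pertains_to": "clients",
                                "info_type": "financial",
                                "sensitivity": "public"}),
    DataItem("support_tickets.json", {"pertains_to": "clients",
                                      "info_type": "feedback",
                                      "sensitivity": "confidential"}),
]

# Retrieve along two dimensions at once without moving any item.
client_financial = find_items(catalogue, pertains_to="clients", info_type="financial")
print([item.name for item in client_financial])
```

Because each item carries several tags at the same time, the same catalogue can be queried by sensitivity, by information type or by who the data pertains to, without being reorganized.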

The oWorkers advantage

oWorkers is GDPR compliant, ISO (27001:2013 & 9001:2015) certified and operates from super secure facilities in each of its three delivery locations. oWorkers has also emerged stronger from the global emergency created by the Covid-19 pandemic. We were among the first to create an environment in which our staff could work from the safety of home during the pandemic, as and when required. With our technology, all staff can operate fully either from home or office, as dictated by the unfolding situation. In addition to trained human resources, oWorkers is able to access the latest technology tools suitable for this activity, thanks to its enduring partnerships with leading providers of technology. Data analysis vs data categorization ceases to be relevant when we can operate with equal facility on both.

Engaging a partner for data categorization within an enterprise

Data is critical to the modern business enterprise. That is not to say that it has not been important in the past, but, in a free market, as the level of competition keeps rising in successful and sunrise sectors, the need for leveraging every bit of competitive advantage becomes more important to survival and success.

The business enterprise collects and stores large volumes of data. Every decision maker in the company ought to know and understand the data that is available with the company so that it can be put to the right use as and when the time comes.

This is the starting point for data categorization within an enterprise.

There could be regulatory reasons too. A new regulation designed to protect privacy of individuals may mandate enterprises to delete the Social Security number of all customers that they have so far been collecting. Even though regulatory changes cannot be anticipated, the systems that house data need to have the adaptability to respond in reasonable ways to them.

‘Data categorization’ is often used interchangeably with ‘data classification.’ Then there are others who seek to make a distinction between the two, using ‘data classification’ to refer to the overarching strategy for data which leads to the slotting of different pieces of data into ‘data categories.’ For the purpose of this discussion, we will consistently use ‘data categorization’ to refer to all aspects of the activity.


Principles for effective data categorization within an enterprise

There is always a starting point for all efforts of an enterprise. In addition, there can be many reconsideration and review points. An existing policy or decision can be changed at any point of time if it is not working for the benefit of the business.

Business need should be paramount

Since this discussion is in the context of a business, or enterprise, it stands to reason that the need of the business overrides other considerations. While it may sound obvious to some, for others this may need articulation.

It, perhaps, also stands to reason that the people most conversant with setting the direction of the business should be involved, since one of the key goals of the exercise is to ensure availability of data for taking business decisions in future.

There will be many partners in the effort, including the IT team, since the modern enterprise relies on technology to process and store information. However, the direction should be provided by the business leaders.

Creation of a policy

When a company is small, it is possible to spread messages easily across the small, committed, start-up teams involved at that stage. However, once it acquires enterprise scale, informal communication channels cease to be effective and formal channels need to be introduced. This usually takes the shape of policies that are created and made known to all, or at least to all impacted, constituents so that they are able to comply with the expectations of the leadership.

Data categorization within an enterprise should be the subject of one or more policies that lay out the various aspects of the subject for the knowledge and compliance of the larger team.

Who should have access to the hush-hush competitor study done with the help of a consultant?

Should the marketing strategy of the company be circulated company-wide?

How can it be ensured that staff members can access the information they need for carrying out their day-to-day activities?

These might be some of the questions the policy needs to address. Once in place, the policy should make it easier to operate.

Categorization of data

Defining the categories that are relevant for the enterprise will, of course, remain the key focus of the exercise. There are many ways, or parameters, or axes, on which data can be categorized. We are no longer limited to a two or even a three-dimensional mapping of each item. Thanks to modern computing systems, each item can be placed in multiple categories at the same time, while remaining in its place. Thus, data can be classified based on:

  • Sensitivity – This is perhaps the most basic requirement, and also one of the objectives. It will depend on the need of the organization to secure the information.
  • Business area – Relevant for diversified enterprises in multiple business or product lines.
  • Function – Is HR the owner of the data or is it Sales?
  • Constituency – Whether it pertains to clients, employees, vendors, etc.

Of course, there can be many others, based on the unique requirement of each company doing it.

Access levels and rights

The placement of data into categories will also lead to defining who has access to what information. Access to employee data, for example, being the personal information of staff, may only be permitted to selected staff in the HR team, while the unaudited financials may only be available to the Finance team. At the same time, once the Balance Sheet has been published, the contents of the document would then become public and open to all.

The IT systems of the enterprise play the role of an ally in data categorization within an enterprise.

Employee personal information may be permitted to the HR team for viewing but information about their compensation may be restricted to the few members of the Compensation and Benefits team.

Though viewing may be permitted, changes to the information may only be allowed with the authorization of the HR Head.

There are many such permutations and combinations that become possible thanks to the technology that is available to everyone. However, the company needs to be clear on its policies regarding data management. All else will follow from there.
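
The access examples above can be expressed as a simple lookup table. The sketch below is a hypothetical illustration, not a description of any real access-control product: the category names, the `POLICY` table and the `is_allowed` helper are invented, teams and individual roles are mixed in one set for brevity, and the edit rule for unaudited financials is an assumption, since the text only mentions who may view them.

```python
# Hypothetical policy: for each data category, who may view it and who may
# change it. All names are invented for illustration.
POLICY = {
    "employee_personal": {"view": {"HR"}, "edit": {"HR Head"}},
    "employee_compensation": {"view": {"Compensation and Benefits"},
                              "edit": {"HR Head"}},
    # Assumption: the text only says Finance may view these.
    "unaudited_financials": {"view": {"Finance"}, "edit": {"Finance"}},
    # "*" stands for everyone; a published document can no longer be edited.
    "published_balance_sheet": {"view": {"*"}, "edit": set()},
}


def is_allowed(category, action, who):
    """Return True if `who` (a team or role name) may perform `action`."""
    rule = POLICY.get(category)
    if rule is None:
        return False  # unknown categories are denied by default
    permitted = rule.get(action, set())
    return "*" in permitted or who in permitted


print(is_allowed("employee_personal", "view", "HR"))
print(is_allowed("employee_compensation", "view", "HR"))
print(is_allowed("published_balance_sheet", "view", "Sales"))
```

Keeping the rules in one table, separate from the code that checks them, is one way to let the policy change without touching the systems that enforce it.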


Should a partner be engaged to support the process?

Finally, data categorization within an enterprise needs to be done. It will not get done merely by having conceptualized a need for it. Enterprises are busy hives of activity, with many moving parts that need to be kept moving to ensure the success of the company.

Dedicating resources to this task, or even getting people to work part-time on this requirement are possibilities that may work in some places, but many organizations have found it worthwhile to engage an enabling partner. This approach has the following advantages:

  • Allows staff to focus on their jobs, and ensures the business can carry on uninterrupted
  • Brings many more ideas to the table that may be beneficial, while the management still retains the right to veto or override any of them

Should you go down this path, here are some pointers that may be useful when you select a partner for this work. Each outsourcer needs to find a partner that will add value. The perception of value could differ from one to another, as will its provision by different potential partners. A holistic view, in the best interest of the enterprise, will eventually be needed.


Criteria for partner selection

Prior experience

For most business processes, quality and accuracy are non-negotiable. Speed and pricing are important considerations, but not at the cost of accuracy. An incorrect entry on a GIS system could send a traveler in the wrong direction, or lead to the aborting of a rocket launch from Cape Canaveral.

oWorkers has a track record of 8 years in the business, having served multiple global clients. We have repeatedly delivered over 99% levels of accuracy across engagements, despite the differences in scales and measurement systems and criteria used by different clients. Feedback of existing clients as well as performance data are both available for verification.


Turnaround time

This refers to the turnaround time or the speed at which work is handled. With ‘just in time’ gaining currency with companies, wherever possible, it is expected that vendors should be able to handle assigned work with great efficiency, reducing the turnaround time to a minimum. This, in turn, enables the principal to operate with greater efficiency and produce better results in its business.

All centers of oWorkers are fully equipped to handle client operations on a 24×7 basis. This ensures that work is handled on the fastest basis. As our clients come from different parts of the world, very often the difference in time zones alone is adequate for overnight delivery of work. If that cannot serve the client requirements, then the 24×7 operation will.


Pricing

This is the third corner of the operations triangle, apart from quality and efficiency. It is also a necessary part of all commercial engagements; the consideration for providing products or services.

The risk, often, is that undue weightage can be given to pricing during the evaluation process. This apparently happens because pricing is a number, and transparent, while other criteria may be subjective and hence could be called into question based on different beliefs. Hence, a lower price is considered to be a safe recommendation while a higher price, with subjective justification of other reasons, is considered to be unsafe, as the decision could be questioned by higher authorities and motives questioned.

A mature company will make the effort to ensure that pricing is only one of the criteria, and not necessarily the most important one. 

A transparent pricing mechanism that offers clients a choice between a dollars per hour rate and a dollars per unit of output rate, along with committed service standards and SLAs, enables oWorkers to satisfy all client requirements. Clients note savings of almost 80% after outsourcing to oWorkers, compared to their in-house costs. This is true for most clients from the US and Western Europe.


Technology

This is the backbone on which services have become global, whether it is for data categorization within an enterprise or any other requirement. This is the reason the BPO industry has also been known as ITES, or Information Technology Enabled Services. Having the right technology for the job is often the difference between success and failure.

oWorkers has forged a wide set of partnerships with owners of technologies. This positions us to make use of the latest versions of their technology, depending on the requirements of the project. Our clients gain from this arrangement as the use of these advanced technologies, eventually, is for their work.

Business continuity

While mankind continues to pursue a relentless agenda of development and progress, there are many variables that are not in control that keep impacting lives and businesses around the globe. Political strife, violence and bloodshed can cause havoc, as can freak weather events attributed to global warming that are causing greater and greater damage. Or, the Covid-19 pandemic that has swept through the world like a wildfire. When such events happen, the ability of a business to function can be impaired. However, business continuity options, if available, can be a great benefit, as they enable companies to operate despite the circumstances, and can differentiate them from competitors while giving them an aura of permanence and reliability.

oWorkers is well positioned to offer business continuity to clients in the event the primary delivery site is affected on account of local issues, from any of the three global delivery locations it operates from. In addition, we have been among the first to implement work from home solutions for our staff. Today, we are able to operate at full capacity either from the workplace or from home, depending on the unfolding Covid-19 situation.

Secure environment

Data is a valuable currency for companies and one they want to protect from misuse. It is imperative for a vendor to demonstrate adequate measures to protect client data, especially since they may be doing similar work for other clients, which is what gives them the advantage of experience in that field. Watertight separation of digital spaces for processing is a must, as is physical segregation through access control technologies.

oWorkers is GDPR compliant, as well as ISO/IEC 27001 certified, and committed to following best practices in technology and data security.

Multilingual capability

As the world shrinks and businesses become global, the ability to understand multiple languages becomes important in keeping the plates spinning. Also, one does not want to seek out a new provider each time the business grows to another geography with another new language. It could become a deterrent for business growth. What one needs is a partner who is able to handle the language requirements that may come up as a result of growth; in other words, a partner who will enable, not hinder, growth.

Along with its presence in three distinct geographical regions of the globe, oWorkers actively builds multicultural and multi-ethnic teams, because of which it now possesses the ability to support clients in 22 languages.

Access to resources

The nature of the business is such that it requires hiring to be done almost continuously because many employees drop by the wayside fairly early in the game. Also useful is the ability to adjust hiring volumes to match peaks and troughs in client volumes.

The ability to hire requires presence in the local ecosystem and acceptance as a contributing member of the community, which oWorkers has established in ample measure in all its delivery locations. As a result, we get a steady stream of walk-in applicants that enables us to hire resources based on their suitability for different projects. In addition, it gives us the flexibility to staff for peaks at short notice, without asking clients to bear the cost of these resources for the rest of the year.

Internal Quality

A unique feature of BPOs has been the reliance placed on a team that is external to the delivery team, yet a part of the organization, that keeps tabs on the performance of operations. This is known as the Internal Quality (IQ) team.

oWorkers has structured its IQ team to report directly to senior management. This way, the leadership team stays informed on delivery related developments and can intervene as and when required. Besides, the IQ team is engaged in leading improvement projects as well as monitoring the performance of operations and giving feedback to frontline workers.

Management Commitment

A business contract will only work if there is something in it for both parties. That is how the leadership team becomes interested in delivering on a contract. The same applies to data categorization within an enterprise. While there is no specific parameter or tool to look at for ascertaining this, experienced business people can ascertain this aspect during the pre-contract interactions.

With a leadership team that has hands-on experience of over 20 years in the industry, oWorkers remains committed to the highest standards of performance in all projects it takes on.



Being a pure player in the space of data services, oWorkers is a specialist in its chosen area of work. It has been recognized among the top three providers of data related BPO services in the world on multiple occasions.

In brief, oWorkers should be your partner of choice, as it is for many leading technology companies as well as several unicorn platforms.

Guide on how to Outsource Data Annotation for the Healthcare Industry


The world is an unforgiving place. One has to keep running in order to stay in the same place; at least in relative terms. Everyone else is running, hence so should you.

It applies to healthcare as well. Patients’ expectations of treatment and care are rising, even as they expect to pay less for the same, as treatments become mainstream and acquire volumes.

What is a business engaged in the healthcare cycle to do, whether it is a pharmaceutical company producing drugs and medicines, a hospital providing treatment and care, or an insurer creating financial solutions for people to pay for healthcare expenses?


What are healthcare solution providers doing?

They are harnessing data and employing smart technology solutions to move forward at a rapid pace, even as breakthroughs in science and medicine happen as and when they do.

Pharma companies are making progress in formulations, repurposing and targeting efforts based on analysis of patient records and clinical trials.

Biotechnology device makers are leveraging data on health outcomes to produce better devices that can peel more layers off a condition.

Insurers are mining information from health plans and correlating with claims to make health insurance cheaper by isolating instances of fraud and predicting claims with greater accuracy.

Hospitals are developing algorithms for efficient allocation of their scarce resources so that a greater population segment can be served with the same set of resources.

As are many other providers engaged in the healthcare industry in one way or another.

The technology solution that has been requisitioned by all of them is what is fairly well known now as Artificial Intelligence (AI) and the process of Machine learning (ML) on which it depends for the level of efficacy with which it can perform.


Data annotation for the healthcare industry – enabling AI

Data annotation is the bedrock on which the superstructure of AI engines is built. The stronger the bedrock, the more reliable the AI engine.

While we may have an understanding of the terms ‘data’ and ‘annotation’ separately, the meaning of the term ‘data annotation’ bears repetition, because of its position as an enabler for the AI which is enabling healthcare industry participants to do more.

AI is the technology that seeks to undertake many tasks that humans perform today using human intelligence.

Like identifying tumors. Or detecting kidney stones. Or teeth degeneration.

To be in a position to do that, AI engines need to be trained to learn how humans think and behave. This training, thankfully, can only be done with the help of humans, by creating training data sets.

The use of human intelligence to make sense of what we may call raw data is not a facility available to a machine or a software program. What software does have is the ability to ingest information, understand patterns, and apply them without human bias to the next set of data that it comes by, to arrive at conclusions it has been taught to arrive at.

For this to happen, the raw data needs to be converted into a format that a machine can meaningfully ingest and from which rules can be taught.

If an MRI scan is the tool based on which the presence of a tumor can be confirmed or denied, then the software needs to be taught how to read an MRI scan which, otherwise, is a meaningless set of pixels for it.

This is done by creating data sets for ML.

A particular specialist doctor might review a hundred MRI scans every week and arrive at a conclusion based on what he sees in them. To enable a machine to do the same, he needs to mark, highlight or point out the aspects on which he has based his conclusion. This needs to be done in a manner that can be understood by the engine that is being trained. It could be through highlighting the size of a particular organ and attaching an outcome to it, which is then uploaded to the software through ‘computer vision’ that enables the software to ‘take in’ this information.

When done repeatedly, the software is built to create associations such that when the next MRI scan comes to it without any markings, it is able to reach the same conclusions as it has been taught to do by the ML data sets.

This is data annotation. To be more specific, data annotation for the healthcare industry, is the process of converting ‘raw’ data to ‘smart’ data for the purpose of training an AI algorithm. As the annotated data set provided to it keeps getting bigger, the AI engine keeps getting smarter, helping in establishing patterns. A kind of equation building process which will enable it to find an ‘x’ the next time it encounters a set of ‘y’s.
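
As a rough illustration of what converting ‘raw’ data to ‘smart’ data can look like, here is a minimal Python sketch of a single annotated training record. The field names, the scan identifier and the label values are all hypothetical, chosen only for this example; real annotation tools define their own formats.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class RegionAnnotation:
    """One marked region on a scan, plus the outcome a reviewer attached to it."""
    x: int        # left edge of the marked region, in pixels
    y: int        # top edge of the marked region, in pixels
    width: int    # width of the marked region, in pixels
    height: int   # height of the marked region, in pixels
    label: str    # hypothetical label name, e.g. "tumor" or "no_finding"


def annotate(scan_id, regions):
    """Bundle a reference to a raw scan with its annotations into one record."""
    return {
        "scan_id": scan_id,
        "annotations": [asdict(region) for region in regions],
    }


# A hypothetical scan with one region marked by a reviewer.
record = annotate("scan-0001", [RegionAnnotation(120, 84, 40, 32, "tumor")])
print(json.dumps(record, indent=2))
```

The raw scan itself is untouched; the annotation sits alongside it as structured data that a training pipeline can read.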


Why outsource data annotation for the healthcare industry?

If we are brutally honest with ourselves, data annotation is a critical but perhaps one of the most monotonous, dull, unappreciated jobs in the whole AI and ML process. And, whether we like it or not, if we have to achieve some sort of progress on AI, this process needs to be done by human data annotators.

In many other industries, data annotation may be a monotonous but straightforward task that can be done by anyone with a little training, like identifying objects on a street while building a training data set for autonomous vehicles. To annotate data for the healthcare industry is a different ball-game altogether that needs to marry a certain amount of knowledge of medicine and healthcare with all the other skills required for the task. Lives will depend on their work.

After all there are only so many ophthalmologists and endocrinologists whose priority is patients, not marking MRIs and CT scans for building AI models. This is where specialised medical data annotators complete the jigsaw.

With the development of AI, the task of data annotation has become a job category by itself, under which medical data annotation could be a further specialisation. Medical data annotators are available in larger numbers than doctors, and at more reasonable prices too. Hence it stands to reason to let this group do data annotation for the healthcare industry.

The choice should be quite clear. To engage an outsourcing outfit that specialises in data annotation solutions. That lives and breathes Data Annotation.

Of course, at the start of any exercise, experts in the particular field, radiologists or cardiologists, may be required to ‘show the way’ and train the resources who are going to be doing the major part of the exercise.


Choosing a partner

The decision to outsource having been taken, selection of a provider would be the logical next step. What parameters would you use to separate suitable providers from the unsuitable ones?

Prior experience

It would be desirable to select a partner with prior experience in data annotation for the healthcare industry. Though experience of providing these services to competitors is ideal, as they are likely to be the most similar to your work, it could create an additional sensitivity of data security as one is always interested in knowing what a competitor is doing. Hence, this may need to be viewed in conjunction with the partner’s ability to provide comfort on data security.

oWorkers has successfully executed a wide variety of data annotation projects covering a wide range of data types and annotation services, over the eight years of its existence.

Accuracy and quality of delivery

The final goal is great quality, irrespective of the type of work, regardless of the type of data annotation. When we enter into a commercial contract, while we are looking for many things, the one thing we always want is great quality of work. Of course, at times we need to make compromises because of budgetary constraints and other reasons, but for a given set of constraints, we want the best quality.

The provider should be able to demonstrate the ability to consistently provide superior quality and accuracy. Testimonials from existing clients are generally accepted as a good way to establish the quality and accuracy delivered on existing contracts.

99% is the accuracy oWorkers delivers across contracts, across different measurement systems, and the same is on offer for your outsourcing project. Many of our clients can be referenced.

Speed and turnaround time

The faster you finish a task, the more you will be able to do, is the simple logic. Business always wants more. Why should healthcare be any different?

To annotate data for the healthcare industry could be a painstaking activity, to be done with care, where data keeps building up gradually. Speed in this context refers to not only the rate at which each transaction is processed, but also to the partner’s ability to find the capacity to process greater volumes so that the AI engine for which it is being done, can be up and running.

With three global centers and 24×7 operations, oWorkers can not only deliver to exacting turnaround time expectations, but also create capacity to work with specialists in order to deliver larger volumes.

Access to talent pool

Human input being a pre-requisite to annotate data for development of AI, the need for human resources in the right quantity with the right skills and training is a dependency. Attrition being a feature of the BPO industry, the need for hiring is continuous, even if the business is not growing, as there will be a requirement to fill the gaps created by people leaving the company. Hence, access to a talent pool for year-round hiring is a requirement.

With the deep commitment to the communities we work in, oWorkers is seen as a preferred employer and benefits from a regular flow of interested candidates approaching us for employment. This also keeps our hiring costs in control as we do not need to spend money on attracting talent. Our philosophy of working with employees, and not freelancers, gives oWorkers the flexibility of deployment based on requirements, and helps create long-term relationships.


Pricing

An essential part of any commercial engagement is that a party delivering goods or services under a contract receives value for it, usually in money terms, based on agreed terms. This is also called ‘pricing.’ In a B2B engagement, the basket of services and products provided is unique to the buyer, as is the price for it. While a low price is desirable, the outsourcer needs to ensure that the pricing terms offered will add value to their business instead of simply opting for the lowest number.

At oWorkers, with our nearshore and offshore centers, you have the potential to save up to 80% of your pre-outsourcing cost. We also offer you a choice between rate per unit of output and rate per unit of resource in a transparent manner.
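
To make the choice between the two pricing models concrete, the arithmetic can be sketched in a few lines of Python. Every figure below (volume, rates, throughput) is a made-up assumption for illustration and does not represent any provider's actual pricing.

```python
def cost_per_output(units, rate_per_unit):
    """Total cost when paying a fixed rate for each annotated item."""
    return units * rate_per_unit


def cost_per_resource(units, units_per_hour, rate_per_hour):
    """Total cost when paying for annotator time at an assumed throughput."""
    hours_needed = units / units_per_hour
    return hours_needed * rate_per_hour


def break_even_throughput(rate_per_unit, rate_per_hour):
    """Items per hour at which both pricing models cost the same."""
    return rate_per_hour / rate_per_unit


# Hypothetical project: 10,000 items to annotate.
units = 10_000
output_cost = cost_per_output(units, rate_per_unit=0.25)
resource_cost = cost_per_resource(units, units_per_hour=40, rate_per_hour=8.0)

print(f"Per-output pricing:   {output_cost:.2f}")
print(f"Per-resource pricing: {resource_cost:.2f}")
print(f"Break-even at {break_even_throughput(0.25, 8.0):.0f} items per hour")
```

Under these assumed numbers, paying per hour is cheaper because the annotators are faster than the break-even throughput; with slower or more variable work, paying per unit of output shifts that risk to the provider.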

Multi lingual

The consistency of processing enables an organisation to expand the volume of processing and build connectors to processes in and out of it. When you seek a partner to annotate data for the healthcare industry it is important to have a partner who is able to offer multilingual processing support as that will be a key factor when the business grows and expands across the globe, as we are, today, more global and connected than we have been at any point of time in human history.

Across the three centers of oWorkers, support is provided in over 22 languages for a wide variety of data services.

Internal Quality

Whatever be the business process outsourced, Internal Quality has come to occupy an important role in ensuring that delivery teams stay true to the task committed to a client and there is a system of monitoring in place before gaps, if any, become visible to the client. Outsourcing of data annotation for the healthcare industry is no different.

With a mix of QA (Quality Assurance) and QC (Quality Control) processes supported by technological tools, oWorkers delivers best-in-class performance, which also supports our delivery of over 99% accuracy. The Quality team, with independent reporting lines directly to senior management, also ensures that the leadership team is kept abreast of developments and is equipped to intervene as and when a need may arise.
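
An accuracy figure of this kind is typically computed by comparing a sample of delivered annotations against labels agreed by a reviewer. The Python sketch below shows one simple way this could be done; the sample data and the 99% target check are illustrative assumptions, not a description of any specific provider's QC process.

```python
def annotation_accuracy(delivered, reference):
    """Share of sampled items where the delivered label matches the reference."""
    if len(delivered) != len(reference):
        raise ValueError("Both label lists must cover the same sampled items")
    if not reference:
        raise ValueError("The QC sample must contain at least one item")
    matches = sum(d == r for d, r in zip(delivered, reference))
    return matches / len(reference)


# Hypothetical QC sample of 200 items, two of which were labelled wrongly.
reference_labels = ["finding"] * 100 + ["no_finding"] * 100
delivered_labels = list(reference_labels)
delivered_labels[3] = "no_finding"
delivered_labels[150] = "finding"

accuracy = annotation_accuracy(delivered_labels, reference_labels)
print(f"Sample accuracy: {accuracy:.1%}")
print("Meets 99% target:", accuracy >= 0.99)
```

A simple match rate like this is only as reliable as the sample is representative, so the way the QC sample is drawn matters as much as the calculation itself.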

Access to Technology

To annotate data for the healthcare industry might sound like ‘technology for the sake of technology’ as it needs to be done to create a functional AI engine, which is, again, technology. But on a deeper look we will find that AI is not the end game. The AI is being created for a purpose, which could be to analyse MRI scans faster, or evaluate many more CT scans as compared to a human and do it more accurately.

Advancements in technology are changing the world, even to the extent of accelerating the development of technologies like AI. Access to technologies for doing data annotation is a useful resource for this purpose.

Our partnership with leading data annotation tool owners for both NLP and Computer Vision projects gives oWorkers access to the best technology solutions, including upgrades to newer versions as and when they take place.

Data Security

Data security is linked to technology through an umbilical cord, since data is stored digitally and moved digitally for transaction processing. Data being a critical resource for a business, ensuring its security becomes a key determinant in vendor selection. When the same vendor works for competitors too, which gives them the benefit of prior experience and knowledge, it becomes even more important.

Firstly, GDPR compliance is a requirement for oWorkers, not a choice, as we operate out of the Eurozone. In addition, we offer super secure facilities and protocols for your data security with ISO certifications (27001:2013 & 9001:2015). Our staff also sign NDAs (non-disclosure agreements).


Variation in volumes is a common feature of business. The business of data annotation for the healthcare industry is no different. Some businesses retain staff at the peak levels so that transaction flow can be handled. These extra resources, during lean periods, are an additional cost for the business. Some other businesses are able to handle short-term peaks by taking on short-term additional resources and staying lean the rest of the time. BPO providers, if they offer the facility of short-term resourcing, can be of great service to clients, as it enables them to stay lean.

For most projects, our local community associations enable oWorkers to ramp up and down fast. To be more specific, by a hundred headcount in 48 hours.


The oWorkers Advantage

As a pure player, specialising in Data and Content services with multilingual capability, oWorkers stands tall amongst its competitors. Our delivery centers are located in three global locations providing the benefit of business contingency in times of need.

All our workforce remains prepared to work from home when required. Our management team has over 20 years of hands-on industry experience. Locally registered in the global centers it operates from, oWorkers has been a consistently profitable enterprise.

As a result, we have been a trusted partner of several unicorn marketplaces over the years.

Partnering with us creates positive social and economic change through employment in underserved communities. By working with us, you help bring motivated individuals into the global digital economy.

What Is Data Annotation And What Are Its Advantages?


In order to understand data annotation, it is essential to take a step back and first understand: what is the need for data annotation?

In 1899, Charles H. Duell, who was the Commissioner of the US Patent Office, is reported to have said that “Everything that can be invented has been invented.” This was in the context of saying that the patent office may soon need to downsize, or even close, as a result.

Did that happen?

Developments have been quite to the contrary and we have been witness to technological innovations rapidly gathering pace and affecting almost all aspects of our lives. Whether it is controlled flight or nuclear power or antibiotics or television or computers or the internet, all these have been developed or invented after Mr. Duell’s assertion.

Does the pace of development look like slowing down?

Quite to the contrary, the pace has never been hotter. It seems mankind is always on the cusp of breakthrough inventions destined to change our way of life.

In its identified niche of data services, oWorkers provides data annotation services and other data-services support needs. We have been identified as one of the top 3 data entry services providers in the world.


Artificial Intelligence

One of the developments that has been gradually gathering steam in the background and is now entering mainstream usage in daily lives is that of Artificial Intelligence, or AI.

The road to understanding ‘what is data annotation’ passes through the center of Artificial Intelligence (AI).

AI is the term used for technologies, or software programs, that have the ability to mimic human behavior.

We know human beings are the most intelligent life form. At the same time, we also know human beings are irrational. We know human beings have good days and bad days. We know human beings have mood swings. We know human beings have prejudices and personal preferences.

What if we could harness human intelligence but deploy it in a manner that takes the human frailties out of the equation? Would that not be the perfect world?

That is the premise AI is based on. And that is the effort people have been making; develop AI algorithms or AI engines that can mimic human behavior in an impartial, objective, consistent, efficient manner.

But, of course, it is not simple. Human intelligence is the handiwork of millions of years of evolution. Expecting it to be replaced by a machine merely by snapping your fingers or flicking a switch is not possible. It is a slow, tedious, painstaking process also known as Machine Learning, or ML.

Human beings have known for many decades how to create a software program using formatted text, or software coding. Formatted text, or coding, and the subsequent interpretation of software code by a computer, drives computers, in a very layman-esque definition. If human beings had not developed coding and its interpretation, computers would not have done anything.

We are, in a way, going through a similar climb with AI. The additional challenge with AI is that the computer needs to interpret not just coded language which it has been doing so far, but raw data that it takes in without any formatting or coding. Hence, it needs to be taught how to recognize and understand the raw data that it is expected to encounter and respond to it in a manner befitting a human mind.

Let us take the most commonly quoted example of AI application these days, that of self-driving cars. The AI that controls the car, unfortunately, does not have innate intelligence like a human being. If a person is crossing the road, a human driver will slow down or stop or swerve to avoid hitting the person. For the AI that gets the image/ video of the road ahead, it is just raw data, without any attached meaning. It has to be taught that if in its ‘vision’ it encounters a shape that has certain dimensions, it means that it belongs to a human being and since we don’t drive cars over human beings on the road, it should stop or slow down or swerve to avoid that object. By doing this, say, a million times, the AI engine builds up a database that allows it to identify a certain shape or set of shapes with human beings who cannot be run over. This is what ML does. The million instances of raw data fed into the machine are known as training data sets that build the machine’s knowledge.

The end goal is that once training is over, and the AI is driving the car, it will recognize a human being if one comes into its range of vision and operate the car as required according to its programming.

The foregoing equips us to address the ‘what is data annotation’ question in the ensuing section.

As a leading provider of data annotation services, oWorkers, led by a management team with over 20 years of hands-on experience, has been supporting global clients in developing and shaping their AI models with the help of its experienced and trained workforce.


What is data annotation?

But what about data annotation, which is what we were trying to understand?

The process of enriching ‘raw’ data in order to create ‘intelligent data’ that can be understood by an AI engine, and that constitutes training data sets, is known as annotation.

It is actually pretty close to the English meaning of the word annotation which, by its dictionary definition, is ‘a note that is added to a text or diagram, often in order to explain it.’

Data annotation, as defined by Techslang, is ‘the process of labeling information so that machines can use it. It is especially useful for supervised machine learning (ML), where the system relies on labeled datasets to process, understand, and learn from input patterns to arrive at desired outputs.’

In the example of the self-driving car, the process of identifying and marking the human on the road in a manner that makes it through to the AI engine, is the answer to ‘what is data annotation.’
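
To show how labeled examples let a program reach conclusions about data it has not seen before, here is a deliberately tiny Python sketch. It uses a nearest-neighbour lookup, which is a stand-in chosen for brevity and not how a real self-driving system works; the measurements and label names are invented for the example.

```python
import math

# Annotated training data: (height, width) of a detected shape in metres,
# together with the label a human annotator attached to it.
TRAINING_SET = [
    ((1.7, 0.5), "pedestrian"),
    ((1.6, 0.4), "pedestrian"),
    ((1.8, 0.6), "pedestrian"),
    ((1.5, 4.2), "vehicle"),
    ((1.4, 4.0), "vehicle"),
    ((2.0, 4.8), "vehicle"),
]


def predict(shape):
    """Return the label of the annotated example closest to the new shape."""
    nearest = min(TRAINING_SET, key=lambda example: math.dist(example[0], shape))
    return nearest[1]


# New, unlabeled shapes are classified by similarity to annotated ones.
print(predict((1.75, 0.55)))
print(predict((1.6, 4.5)))
```

Without the human-supplied labels in the training set, the program would have nothing to compare the new shapes against, which is the role annotation plays.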


Advantages of Data Annotation

Data annotation being a facilitator in the journey of building reliable AI, and not the final output, its advantages can be linked to making the AI engine effective and reliable. That is both its purpose and its key advantage. Its advantages are inextricably linked to the advantages of AI.

Data output often suffers from the GIGO (Garbage In Garbage Out) principle. The quality of output one can expect from a machine or a computer can only be as good or as bad as the input data it received and processed. While good input data might still be spoiled by a software program or human intervention, bad data can never lead to good outcomes. Hence the biggest benefit of data annotation is that, when done well, it leads to the creation of a reliable and smart AI engine.

A related benefit could be articulated as that of customer experience that will result from a reliable and smart AI engine as opposed to an AI engine that behaves like a bumbling idiot. In fact, in the right application, customer experience resulting from an AI engine could be far better than that from a human interaction. As a simple example, in case of a request for retrieval of information, an AI engine will probably do it much faster than a human.

It is a task of great responsibility as the future depends on it. oWorkers operates from secure facilities in three geographies and multiple centers across Egypt, Bulgaria and Madagascar and is not only GDPR compliant but also ISO (27001:2013 & 9001:2015) certified. Our partnerships with technology providers ensure that we have access to the latest technologies for data annotation and other work.

A good way to appreciate the benefits of data annotation would be to review a few ‘use cases’ or applications of AI that emerge after the model has been trained and implemented. Understanding the benefits delivered by a process often leads to a better understanding of the process, as should be the case as we try to unravel the layers of ‘what is data annotation.’

Enhancing Social Media Content Relevance

Have you ever noticed that if you look for flight tickets from, say, Miami to Detroit, for some time after that, you might start receiving pop-ups and advertisements with promotional fares for the sector? This is AI at work, based on the algorithm that the engine has been taught. A bad engine might just note the sector and start feeding you with promotions for the sector. A better engine might even note whether you managed to book or not, and send promotions your way only if you failed to book in that attempt.

Social media thrives on feeding customized content to users based on their profile and the footprints they leave behind on the platform while engaging with it. With the aid of data annotation, owners of the platform will strive to feed content that is relevant and personalized.

Security Monitoring

Streetside cameras, hundreds of them, have been installed in a particularly sensitive part of town where incidents of theft and mugging have been on the rise. The footage is beamed to a control room where the policemen on duty are expected to look at the feed coming in from the hundred plus cameras and identify potential flashpoints and alert the cops on patrol. It is a cumbersome exercise to constantly switch from one to the other and so on. It causes fatigue and several people have to be deployed simultaneously.

An AI engine was developed and taught to analyse the feed coming from the cameras and warn the cops on duty of potential danger. For example, if the AI engine has been taught to identify a weapon being carried by a passer-by and raise an alert, it can do so almost instantly. Not only that, it can monitor the hundred plus feeds all by itself, releasing the cops doing this to do the real work of patrolling the street and adding to the force available there.

Autonomous Vehicles

Being the current favorite example of AI application, this does not need much explaining. It has also been covered in an earlier part of this post.

Search Engines

Operating in a manner somewhat similar to social media, the objective of AI engines used by search engines is to make the information contextual and relevant.  

For example, if the query is about the weather, at the simplest level, knowing where it has been asked from will make a difference to the results. If the engine knows that the person asking the question is an avid skier, returning results relevant to the person’s passion might be an absolute delight for the user. All it needs is the training data to be annotated in a manner that enables it to recognize this fact about the user.


The oWorkers advantage

oWorkers has strategically adopted the ‘employee’ model as opposed to the ‘freelancer’ model for its operations. While it brings upon us greater responsibility with regard to our staff, it enables us to exercise greater flexibility in terms of client requirements. As contributing members of local communities we are established as employers where many would wish to work, which gives us a steady stream of incoming jobseeker applications, substantially reducing our cost and effort in recruitment and training, while reducing our attrition. It also gives us the room to cater to short-term spikes in client requirements.

Being located in cultural melting pots, our teams are multilingual and offer services in 22 of the most popular global languages. All our centers are equipped to operate 24×7 if client operations need it.

Our pricing is transparent. Usually we offer a choice between cost per unit of time and cost per unit of output to clients. We have been a steady profitable enterprise, with efficient operations allowing us to share benefits with clients, and operate as locally registered entities. Our staff regularly rate us above 4.6 on a scale of 5 on sites like Glassdoor on satisfaction.

Along with the ‘what is data annotation’ question, this should address the ‘who to partner with for data annotation’ question as well.

10 Steps to Outsource Video Annotation

Video Annotation – An Overview

An annotation, by its dictionary definition, is a note that is added to a text or diagram, often in order to explain it.

An answer on Quora defines video annotation as ‘…the process of adding tags and labels to the video clips. It is used to help the computer vision-based models easily detect and identify the objects.’

Video annotation, in other, simple terms, is the process of adding on information to a video which would enable a software or computer to understand the data that a video represents.

Video is a complex set of data that contains information in many forms. From a contextual understanding of all of them together, a human being, through her intelligence, makes sense of this data. In its raw form, however, a machine is unable to make sense of it like a human being would. If we have to make it meaningful for a computer to understand, we have to enrich the information in the raw video data with annotations, or tags, so that it can be understood.

One way of looking at video as data is as a sequence of images, each captured a fraction of a second after the previous one, creating continuous motion to the naked eye, but identifiable as distinct images by a computer. The movement adds a dimension to the data already contained in each image.

When we watch the recording of a football game, a human being intuitively understands that players wearing shirts of the same color constitute one team and players wearing shirts of a different color are the opposing team. However, for a computer, at the start, the shirt color is only raw data, and has no contextual reference. To enable it to understand this concept, an annotation has to be done that puts a name or a tag to the two different shirt colors so that a difference is created. Thereafter, through programming, the computer can be taught to treat the two colors as two different entities. Again, taking a simple illustration, if it was a video game, the computer can be taught that shirts of one color shoot at the goal on one end of the park while the other color shoots at the other end. This is what video annotation solutions do for us.
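
The football illustration can be sketched in Python as a set of per-frame tags. The colour-to-team mapping, the player numbers and the frame rate below are assumptions made up for the example; actual video annotation tools use their own, richer formats.

```python
from dataclasses import dataclass

# Hypothetical tagging rule supplied by a human annotator.
TEAM_BY_SHIRT_COLOR = {"red": "home", "blue": "away"}


@dataclass
class FrameAnnotation:
    """A tag attached to one player in one frame of the video."""
    frame_index: int
    player_id: int
    shirt_color: str
    team: str


def frame_time_seconds(frame_index, fps):
    """Position of a frame in the video, given an assumed frame rate."""
    return frame_index / fps


def tag_player(frame_index, player_id, shirt_color):
    """Turn a raw observation (a shirt colour) into a meaningful team tag."""
    team = TEAM_BY_SHIRT_COLOR[shirt_color]
    return FrameAnnotation(frame_index, player_id, shirt_color, team)


# Two players tagged in frame 50 of a video assumed to run at 25 frames per second.
annotations = [tag_player(50, 7, "red"), tag_player(50, 10, "blue")]
for annotation in annotations:
    seconds = frame_time_seconds(annotation.frame_index, fps=25.0)
    print(f"t={seconds:.2f}s player {annotation.player_id}: {annotation.team}")
```

Repeating the same tags across consecutive frames is what lets a model follow each team through the movement of the game, not just within a single image.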

Why is this needed?

Artificial Intelligence (AI) and Machine Learning (ML) are perhaps the most important reason why video annotation is needed. In ML, through annotated data, a computer is taught to make sense of a video. Once it has been fed enough data and provided enough training, it can operate independently. In other words, it will reach a point where for the next video that it is fed, based on the training it has received, it will be able to interpret it in the same manner. For example, if an AI program is being readied to identify armed intrusions, one of the trainings that might be given is to identify a gun and the various shapes and sizes it could come in. Once the training is complete, the AI program can issue alerts based on which guards can take action, instead of having to manually go through footage beamed by hundreds of live surveillance cameras.

oWorkers has been providing data and video annotation services to clients in a variety of industries like Autonomous Vehicles, Medical AI, Satellite & Aerial imagery, Sports, Retail, Augmented Reality, Insurance, CCTV & Security, Robotics and Agriculture. 

The 10 Steps to Outsource Video Annotation

In today’s day and age, outsourcing of video annotation services is almost a given. In fact, it is so much the norm that one may need to justify the case where one is NOT outsourcing rather than when one is.

Outlined here is a sequence of steps that, executed correctly, will hopefully enable you to get the full benefit of video annotation and realise the purpose for which it was initiated. You must remember that equipping your vendor to do a good job is, eventually, in your own interest.

Identify Your Need

It is great that you think that the right thing to do is to outsource video annotation. Most of the world would agree with you.

But there is a much more basic question you should answer first. Do you have a need for video annotation solutions? In other words, why do you want video data to be annotated? What do you hope to achieve by doing so? Whether to do it inhouse or outsource it is, in a way, only a matter of detail once it has been established that there is a need for it in your business.

Of course, there are many smart vendors who will be able to tell you why video data should be annotated, because it is in their business interest for more businesses to have this need. But they cannot answer on behalf of your business. Only you can. While vendors will willingly tell you how the video annotation process should be run and the benefits it will deliver, you should understand that the final ownership lies with you. Eventually it is an input into your business and needs to be driven by you and your business needs.

Will it solve a problem? Will it add value? What will be the criteria for success or failure?

Establish Selection Criteria

There are a host of factors that should be considered while evaluating potential partners, like technology capability, resourcing strength, multilingual coverage, financial stability, management support, etc. All these factors eventually manifest themselves, one way or another, in one of these three parameters in any delivery organisation, and hence should be accorded pride of place in the evaluation criteria:

Domain and function capability and the ability to deliver required quality and accuracy

oWorkers has been a preferred partner of choice for video annotation services for leading organisations, including several unicorn marketplaces, for over eight years. Many of our clients are referenceable. Through our three global delivery centers, we cater to over 22 global languages.

Ability to deliver the right quality within required timelines

With 24×7 operations, oWorkers is able to meet the most stringent timelines on client projects, in many cases delivering overnight. With clients from around the globe, we are also able to leverage time differences to expedite delivery.

Price point at which the capabilities are being made available and its reasonableness

oWorkers offers a choice of per unit or per hour pricing. Clients can choose. Our clients, especially from the US and Western Europe, have reported savings of 80% after outsourcing to oWorkers.
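
One simple way to apply the three parameters above is a weighted score per vendor. The Python sketch below assumes hypothetical weights, vendor names and 1-to-10 scores; the weighting is a choice each business would make for itself and is not prescribed anywhere in this post.

```python
# Assumed relative importance of the three parameters (sums to 1.0).
WEIGHTS = {"quality": 0.5, "timeliness": 0.3, "price": 0.2}

# Hypothetical vendors scored from 1 (poor) to 10 (excellent) on each parameter.
VENDOR_SCORES = {
    "Vendor A": {"quality": 9, "timeliness": 7, "price": 6},
    "Vendor B": {"quality": 7, "timeliness": 9, "price": 9},
    "Vendor C": {"quality": 6, "timeliness": 6, "price": 10},
}


def weighted_score(scores):
    """Combine the per-parameter scores into one number using WEIGHTS."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Rank vendors from the highest weighted score to the lowest.
ranked = sorted(
    VENDOR_SCORES,
    key=lambda vendor: weighted_score(VENDOR_SCORES[vendor]),
    reverse=True,
)

for vendor in ranked:
    print(f"{vendor}: {weighted_score(VENDOR_SCORES[vendor]):.1f}")
```

Changing the weights can change the ranking, so it helps to agree on them before the scores are collected.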

Put The Word Out, Invite Interest

Hopefully the first step has been taken, and it has been established that your business has a need for video annotation services. It has, perhaps, also been established at the same time that it is a time and resource consuming activity which you do not have the resources for. Hence, it has been agreed to outsource video annotation.

The next logical step is to start evaluating potential vendors. However, this does not happen automatically as we live in a world of information asymmetry. In other words, not everyone has access to all material information. Hence, at this stage, you need to ensure that your requirement is known reasonably well in the community from which will emerge the partner that will take over this task for you.

The Request for Proposal (RFP) process is a well-established process for B2B engagements. Whether you follow the RFP process or not, it is certainly recommended that you follow its discipline in some shape or form.

What this means is that you put the word out in such a way that you give out relevant information about the work that you seek to outsource, without divulging confidential information. You could provide information like:

  • Brief description
  • Volume of work
  • Timelines expected
  • Technology/ Tools to be used
  • Manpower qualifications

How does it help?

It helps you by limiting responses to genuine parties who have an interest in doing the work that is available. Without information, there could be a slew of applications on the basis of incorrect assumptions, that you will spend time in reviewing and eventually eliminating because the work is not what they thought it was.

Identify Top Few, or Restart Process

The previous step would, hopefully, have resulted in a reasonable number of responses for video annotation solutions to your RFP or advertisement or whichever method you adopted to get the word out regarding your requirement.

At this stage, you need to put yourself in a position from where you are able to get into detailed discussions with a few suppliers out of which you are able to select a partner.

If you have received a large number of applications, you might want to evaluate the submissions and select down to a few, say two or three, that seem most appropriate.

If you have received a handful, say two or three, that seem relevant, you can avoid the earlier step of evaluation and retain them for a detailed discussion later.

But if you have either not received responses or received responses that are not relevant for your work, then you would need to go back to the drawing board and examine why that was the case.

Did you not define your requirements accurately?

Did you put out expectations that were unreasonable?

Detailed discussion, Final Selection and Contract

This is an obvious next step. You will now be in a position to hold detailed discussions and both parties will be required to share information, ordinarily preceded by the signing of an NDA (Non-Disclosure Agreement). The vendors will tell you why they are the best choice for outsourcing video annotation.

Specifics will be discussed at this stage, such as pricing, volumes and other circumstances that could have an impact on the engagement.

Eventually it will lead to you identifying the most suitable partner and offering them a contract. However, at this stage, or even later, it will be wise to keep the other shortlisted parties interested in some way so that you have a Plan B if and when it may be needed. Selecting a vendor in a B2B relationship is a resource-intensive task. As it has a downstream impact on the business that is selecting a partner, it is done carefully and deliberately. One does not want to be in a situation where the engagement breaks down soon after the process of selection has been completed, as it will mean going through it all over again. Hence keeping a second vendor warm is desirable.

The formal agreement is executed based on mutually acceptable terms and conditions.

Trial Run

If a Trial Run has been agreed this would be the time to initiate it. The contract would define the success criteria for this step and the rights of the client and vendor based on various outcomes possible.

If a vendor is not involved, a Trial Run is still a good idea before the organisation fully commits to it, hires resources, invests in technology, and everything else it entails. A sample of representative transactions would be carried out in this phase, enabling a further evaluation of the outcome and benefits, and enabling tweaking the process to enhance effectiveness, where required.

Developing and Implementing the Project Plan

Once it has been established that the organisation is ready to commit to it, a detailed Project Plan for the video annotation solutions will be developed, for both the outsourcer and the provider. This could also be done prior to commencement of the trial run.

The Project Plan will establish timelines for various tasks, define responsibilities, dependencies, checkpoints, control mechanisms and other variables. In other words, the work envisaged in the project is now being mainstreamed. The ‘Project Team’ or ‘Senior Management’ have shepherded the process so far, but henceforth it will become the responsibility of various teams based on the task and their role. Thus, the Project Plan defines ownership for various workstreams. Once this is activated, it is ‘all systems go.’

The Project Plan will take the initiative to the point of ‘steady state,’ the state where it should be operating hereafter, till forever, or till something happens that causes it to change, after having gone through a learning and improvement cycle earlier on.

Technology Readiness

When you outsource video annotation, once the Project Plan has been agreed and activated, various workstreams will be initiated. One of the key ones is technology. The vendor will need to ensure that the requisite technologies are available and ready to begin work for the new client. If client systems are needed, a handshake will be established.

Since we operate from the Eurozone, GDPR compliance is not a matter of choice. It is mandatory. We operate on secure technologies and protocols with ISO certifications (27001:2013 & 9001:2015). We also offer physical segregation of workspaces where required.

Team Identification and Training

The other major workstream is ‘people.’ The vendor will need to identify the team that will support activities under this contract. Fresh resources will be hired as needed.

As subject matter experts, at this stage, are with the outsourcer, there will be an initial phase of ‘knowledge transfer’ or training during which the vendor staff will be made aware of the processing requirements.

Future training requirements, generally, are handled by the partner without the client staff having to go back to train more and more new hires.

oWorkers operates with employees, not outsourced staff or contractors. This provides it with the flexibility to deploy staff based on need. It has consistently been rated 4.6 and above out of 5 on Glassdoor by satisfied employees. It is deeply connected in the communities it operates in, pays social taxes for its employees and remains a desirable employer for most people in the catchment area. With its deep connections, oWorkers gives you the flexibility to ramp up and down by 100 people in 48 hours, despite its strong screening processes based on Education, IQ, Language and Experience.

Ramp, Then Full Steam Ahead!

Eventually, the real work starts. In many cases, when you outsource video annotation, there is a gradual sequence of handling larger volumes each passing day and week as more resources become trained and available, and the vendor gets confidence in the new project. The work gradually ramps up to handle the agreed volumes over a period of time. This is also provided for in the contract and project plan in most cases.


oWorkers is a BPO services company with over eight years of experience and is led by a team with over twenty years of hands-on experience in the outsourcing industry. Being a pure player in data outsourcing services, oWorkers is in a position to provide comprehensive support to clients seeking to outsource video annotation. Our three global processing centers, in the most preferred regions of the world for outsourcing, can provide business continuity to your projects.

10 Steps to Outsource Polygon Annotation


Most of us understand what a polygon is.

By one common definition, “a polygon is a two-dimensional geometric figure that has a finite number of sides. The sides of a polygon are made of straight line segments connected to each other end to end. The line segments of a polygon are called sides or edges. The point where two line segments meet is called vertex or corners, henceforth an angle is formed. An example of a polygon is a triangle with three sides.”

Annotation should also be a well-understood word. It is commonly defined as “a note that is added to a text or diagram, often in order to explain it.”

However, polygon annotation, formed by bringing the two together, might not be as widely understood a term.

Polygon annotation is a technique for identifying an object or shape in a two-dimensional image which can be understood by a computer through ‘computer vision.’

Why is polygon annotation needed?

Artificial Intelligence (AI) and Machine Learning (ML) is the simple answer.

Much has been made of smart machines with AI taking over the world. Turns out machines are not really that smart. Smart humans are training machines to learn to see and think like them so that they can perform some of the tasks human beings have always done.

ML is the process through which a computer is trained to become more intelligent. It relies on large sets of data that are enriched in such a way that a computer can ‘see’ them through computer vision, and learn from them. For a computer an image is just data; a collection of pixels with no inherent meaning. A car in a picture is, again, just a collection of pixels. But when a human being draws a line around the car and the combined information is fed to the computer, it begins to understand that that particular shape is a car. This is repeated thousands, even millions, of times, enabling the software to develop the rules by which it can identify objects in the next image, one that has not been marked by human hand.

Polygon annotation is one technique through which objects are marked out by drawing a tight polygon around them. Converting raw data into enriched data allows a machine to learn what humans intuitively know, and use the training to continue to apply the learned rules to future data that is unmarked.   
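To make this concrete, here is a minimal sketch in Python of what a single polygon annotation might look like as data. The class name, the field names and the pixel coordinates are assumptions made for illustration; real annotation tools each have their own formats. The sketch also shows why a tight polygon carries more information than a simple bounding box: it covers less background.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PolygonAnnotation:
    """A hypothetical record for one annotated object in an image."""
    label: str
    vertices: List[Tuple[float, float]]  # (x, y) pixel coordinates, in drawing order

    def area(self) -> float:
        """Area enclosed by the polygon, computed with the shoelace formula."""
        total = 0.0
        n = len(self.vertices)
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]  # wrap around to close the shape
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0

    def bounding_box(self) -> Tuple[float, float, float, float]:
        """Smallest upright rectangle (x_min, y_min, x_max, y_max) containing the polygon."""
        xs = [x for x, _ in self.vertices]
        ys = [y for _, y in self.vertices]
        return (min(xs), min(ys), max(xs), max(ys))


# An invented six-point outline of a car, for illustration only.
car = PolygonAnnotation(
    label="car",
    vertices=[(10, 40), (30, 20), (70, 20), (90, 40), (90, 60), (10, 60)],
)

x_min, y_min, x_max, y_max = car.bounding_box()
box_area = (x_max - x_min) * (y_max - y_min)
print(car.label, car.area(), box_area)
```

In this invented example the polygon covers 2,800 square pixels while its bounding box covers 3,200, so the box would include 400 square pixels that are not part of the car.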

Polygon annotation solutions make this possible.

When you outsource polygon annotation, a sequence of activities is initiated that, done correctly, has the potential to deliver significant benefits in the form of reliable AI engines. Done incorrectly, it has the potential to unleash mayhem through an incompetent AI engine.

This makes the choice of partner an important decision, assuming outsourcing will be the preferred choice. However, what often gets overlooked is how the entire set of activities, starting from the need for polygon annotation services, through to the point when it starts running like a well-oiled machine, is conceptualised and set up. The selection of a partner is embedded within this sequence.

Outlined here is a recommended sequence that will guide you from the point where you have started thinking that you would like to contract a vendor, through to the end point where they have started delivering.


1. Establish the need for polygon annotation

An outsourcing vendor who provides polygon annotation services to your business is like an extended part of your organisation, delivering a key component within your entire process or value chain. The process, or value chain, is yours, a detail which should not be forgotten in the focus on getting the most suitable vendor. The vendor is secondary to your process. Of primary importance is for you to establish the need for the process which, as a subsequent step, if it adds value, will be outsourced.

After all, you will not take on a vendor only because he is the best in the world for polygon annotation. You will take him on if your business has a need for polygon annotation and the vendor meets your criteria for selection, whether he is the best in the world or not.

Hence, first establish your business process. Does your business need Polygon annotation or Bounding Box annotation? Or is it Text annotation that will meet your requirements? Once your process has been established, figure out which parts it makes sense to outsource to the BPO company.


2. Build internal consensus on strategy

Once the process has been established, the next decision is whether to outsource it or do it in-house.

Even though it is common to seek outsourced support for polygon annotation solutions, it is still a decision that needs to be made by each outsourcer for themselves. It is an important decision. Outsourcing to a data annotation company is often considered when the process is voluminous, which means it is likely to be resource-hungry. The process is being done despite being resource-hungry because it has value for your company.

Whichever way we look, it is an important decision. A discussion on the subject will help in bringing consensus on the issue within your company. Moreover, many outsourcing arrangements end up being fractious because the outsourcer might have contracted the process out without either building agreement on the subject or without adequate deliberation on expectations from it.

This process of brainstorming and consultations should enable you to identify:

  • Reasons for seeking a partner; benefit articulation/ quantification
  • Key personnel (in your team) responsible for the engagement
  • Success and failure criteria


3. Get the word out

Once it is clear that outsourcing is the way forward, all efforts need to be made to ensure that your requirement of a partner for polygon annotation solutions is made known widely enough to have interested parties knocking on your door. Otherwise, having work on offer but no takers for it could be an embarrassing situation for a business. In a business situation, having potential vendors apply to you for work places you in a better negotiating position compared to when you are asking them to take up your work. You may want to leverage one of the many channels available to businesses, such as:

  • Letting it be known in the business networks and trade circles that you are a part of.
  • Using online directories and Internet search engines to unearth possible partners and communicating the message to them.
  • Issuing a Request for Proposal (RFP) is the process preferred by large enterprises. An RFP is a standardised proposal form where you specify the information you are looking for.
  • Reaching out directly to vendors who may be providing similar services to competitors.

Whichever of the above, or combination, you choose, the process should involve a clear communication on your requirements like an overview of the project, description of the work required, data and file types, output expected, volumes, staff capability and profile requirements, timelines and any other relevant information. The purpose of this is to ensure that only genuinely interested parties apply and you don’t have to sift through large numbers who applied based on incorrect assumptions.


4. Develop criteria for selection of vendor

A parallel exercise will be to draw up an evaluation process. In other words, a set of criteria that will be used to separate the grain from the chaff, and zero in on the partner you believe is most suitable for your polygon annotation solutions requirement.

If you have not created one of your own, the following could be a great starting point:

Relevant prior experience

While absence of relevant prior experience may not necessarily be adequate to weed the vendor out, its presence will definitely deliver benefits. It would, for instance, facilitate a quick start to the project instead of a slow ramp. In addition, the requirement of client resources at the start will reduce.

Polygon annotation services have been delivered by oWorkers to several clients over its eight years of existence. The continuing and growing relationships are proof of our delivery capability. Many of our clients are referenceable.

Quality and accuracy

When you wish to outsource polygon annotation, superior quality and accuracy, provided consistently, will be the best advertisement for a vendor.

oWorkers follows a strict QA (Quality Assurance) and QC (Quality Control) process that is independent of the delivery process. The Quality team represents the client and seeks to eliminate errors and poor quality before the client’s processes are impacted. Our Quality team is also the eyes and ears of our senior management. We have consistently delivered over 98% accuracy.
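As an illustration of how an accuracy figure of this kind can be checked on your side, the sketch below compares a sample of vendor labels against reference labels prepared by your own experts. The function names, the sample data and the 0.98 threshold (echoing the figure quoted above) are assumptions for illustration, not a description of any vendor's actual QC process.

```python
from typing import Sequence


def sample_accuracy(vendor_labels: Sequence[str], reference_labels: Sequence[str]) -> float:
    """Share of sampled items where the vendor label matches the reference label."""
    if not reference_labels or len(vendor_labels) != len(reference_labels):
        raise ValueError("Both label lists must be non-empty and of equal length")
    correct = sum(v == r for v, r in zip(vendor_labels, reference_labels))
    return correct / len(reference_labels)


def meets_benchmark(accuracy: float, threshold: float = 0.98) -> bool:
    """True when measured accuracy reaches the agreed (hypothetical) threshold."""
    return accuracy >= threshold


# Invented audit sample: 50 items, one of which the vendor labelled wrongly.
reference = ["car"] * 30 + ["truck"] * 20
vendor = ["car"] * 30 + ["truck"] * 19 + ["car"]

accuracy = sample_accuracy(vendor, reference)
print(accuracy, meets_benchmark(accuracy))
```

An exact-match comparison like this suits categorical labels. For polygon shapes, a client would more likely agree an overlap measure with the vendor, which is outside the scope of this sketch.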

Speed and turnaround

The sooner the partner can produce output, the faster your processes can run, and the more you can produce. Speed is an essential requirement.

With our three global centers and 24×7 operations, oWorkers is well placed to deliver rapidly. In most cases, oWorkers will turn around today’s work even before you begin work the next morning.


Pricing

When you outsource polygon annotation, or anything else, as long as it is a commercial engagement between two parties, pricing will always be a part of the contract. In a B2B contract, generally it will be a unique basket that is purchased, for which the price will also be unique, and negotiated. Of course, lower will be better, except where it seems that it will not be sustainable for the vendor.

Our clients save up to 80% over their current costs when they outsource to us. We are able to offer a choice of per hour or per output unit pricing, with our transparent pricing mechanism.
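When a vendor offers both per hour and per output unit pricing, a quick calculation can show which model is likely to cost less for a given volume. The sketch below is only a rough aid; the rates, the volume and the throughput are invented numbers, and the function names are hypothetical.

```python
def cost_hourly(volume: int, units_per_hour: float, hourly_rate: float) -> float:
    """Total cost when paying for the hours needed to process the volume."""
    hours_needed = volume / units_per_hour
    return hours_needed * hourly_rate


def cost_per_unit(volume: int, unit_price: float) -> float:
    """Total cost when paying a fixed price for each annotated item."""
    return volume * unit_price


def break_even_throughput(hourly_rate: float, unit_price: float) -> float:
    """Units per hour at which both pricing models cost the same."""
    return hourly_rate / unit_price


# Invented figures: 10,000 images, 40 images per annotator hour,
# 8.00 per hour versus 0.25 per image (currency left unspecified).
hourly_total = cost_hourly(volume=10_000, units_per_hour=40, hourly_rate=8.0)
unit_total = cost_per_unit(volume=10_000, unit_price=0.25)
threshold = break_even_throughput(hourly_rate=8.0, unit_price=0.25)
print(hourly_total, unit_total, threshold)
```

With these made-up numbers, hourly pricing works out cheaper because the assumed throughput of 40 images per hour is above the break-even point of 32. If the images turn out to be harder to annotate and throughput drops below that point, per unit pricing becomes the cheaper option.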

Multi lingual

In the current context of globalisation and companies doing business around the world, an important consideration for global businesses as well as for ones with aspirations, is to take on a vendor who can support multiple popular languages.

Over 22 languages are supported across three global centers of oWorkers, positioning it uniquely to serve needs of companies, both global and local in any region.

Technology and Data Security

Along with Data Security, which is almost joined at the hip, Technology forms the backbone of every BPO business. In fact, technology is the reason for the existence of the BPO business in its present global shape.

When you outsource polygon annotation to oWorkers, we leverage the best technology through our partnerships with leading providers. We are also GDPR compliant and ISO (27001 and 9001) certified.

Scalability and access to human resources

Apart from technology, people are the other critical component that powers the BPO business, as most BPO processing is, by design, a human activity. If it could have been automated, it would have been. Key parameters, such as attrition and ability to hire for peaks and troughs, should be critically examined when you seek to outsource polygon annotation.

oWorkers possesses the flexibility to ramp up and ramp down, by a hundred headcount in 48 hours, particularly for Computer Vision projects. We work with employees, not contract staff or freelancers, which gives us flexibility in deployment. Our attrition can be considered as best-in-class and our employees consistently rank us 4.6/ 5 or better on Glassdoor.

Financial health and management support

Everything needs money. Financial stress of a company can rapidly percolate down to its operating units, impairing performance. Hence financial stability is an important consideration, as is compliance with legal requirements wherever it operates.

oWorkers has been a consistently profitable enterprise. It operates as a locally registered company in each of the three current locations, pays local and social taxes for its employees and is deeply rooted in the communities it operates in.


5. Shortlist vendors

In an ideal world, we would do a detailed evaluation of each applicant for polygon annotation services. However, in the real world, this is not possible. We have to balance the effort with the reward from it. There could be frivolous applications, there could be incorrect or incomplete applications, there could be clearly unsuitable applications. A detailed analysis is not needed to weed these out at this stage, and removing them early allows you to focus better on the relevant few.

Based on the detailed checklist for evaluation, if you have a lot of applications, you should funnel down the list to a few that seem most suitable.
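One simple way to funnel a long list down is a weighted scorecard built on the selection criteria from step 4. The sketch below is a minimal illustration: the weights, the 1 to 5 rating scale and the vendor names are all assumptions, and each business would set its own.

```python
from typing import Dict, List

# Hypothetical weights for the criteria discussed in step 4; they sum to 1.0.
WEIGHTS: Dict[str, float] = {
    "experience": 0.15,
    "quality": 0.25,
    "speed": 0.10,
    "pricing": 0.15,
    "multilingual": 0.05,
    "technology_security": 0.15,
    "scalability": 0.10,
    "financial_health": 0.05,
}


def weighted_score(ratings: Dict[str, float]) -> float:
    """Combine per-criterion ratings (assumed 1 to 5) into a single score."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"Missing ratings for: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


def shortlist(vendors: Dict[str, Dict[str, float]], top_n: int = 3) -> List[str]:
    """Return the names of the top_n vendors, highest weighted score first."""
    ranked = sorted(vendors, key=lambda name: weighted_score(vendors[name]), reverse=True)
    return ranked[:top_n]


# Invented applicants, each rated uniformly to keep the example short.
applicants = {
    "Vendor A": {name: 4 for name in WEIGHTS},
    "Vendor B": {name: 3 for name in WEIGHTS},
    "Vendor C": {name: 5 for name in WEIGHTS},
    "Vendor D": {name: 2 for name in WEIGHTS},
}

print(shortlist(applicants))
```

A scorecard like this does not replace judgement, but it makes the reasons for a shortlist visible to everyone involved in the decision.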

Of course, if you don’t get responses, or only a few unsuitable responses, you may need to go back to the drawing board and review your communication and the terms you have broadcast. Perhaps you have been unrealistic in your expectations and vendors did not find it worthwhile to apply.


6. Detailed discussion, evaluation and finalisation

This is when the detailed discussions will happen and both parties will be required to share information. In most B2B cases, this will be preceded by the execution of a Non Disclosure Agreement (NDA) which binds both parties to treating the information received as confidential and enjoins them to ensure it is handled with the utmost care. This is the stage where the potential vendor will make a case for being selected, scope of services will be discussed, including indicative pricing. This is one of the key phases of the process.

Eventually you will decide on the most suitable partner and issue a Letter of Intent. This will be after detailed discussions, site visits, interacting with staff members, exploring combinations, negotiating on price and service levels. Contractual terms, which would have been already discussed during the previous stage, would now be formalised. The other interested and shortlisted vendors will be retained as backups in case this partnership falls through for some reason.


7. Finalise terms and sign contract

The formal agreement to outsource polygon annotation is executed based on mutually acceptable terms and conditions that would have been discussed during the phase of detailed discussions.

Of course, there could be some cases of disagreement on the overarching legal clauses which may need to be sorted out between the legal teams or senior management. In most cases, once the operational details have been agreed, the legal terms will generally be ironed out.


8. Implementation of Project Plan

A high-level project plan would have already been discussed and agreed at the stage of detailed discussions and contracting. Once the contract is signed, the Project Plan or Implementation Plan is detailed out to include all micro activities, tasks and responsibilities, so that the broad dates and milestones discussed earlier are met. The engagement then reaches a point that different businesses call ‘steady state’ or Business as Usual (BAU), where it is handed over from the Project Team to the Operations team, to be handled, in a manner of speaking, ‘from there to eternity.’


9. Technology, training, hiring and support

The Project Management team will give the go-ahead to the various vendor teams to include the new project in their support plans. This could include hiring. This could include identification of a physical workspace. This could involve identification of the trainers and QAs. This could involve initiating the process for a technical handshake with the client’s team.  

The client will arrange the initial training as well as make arrangements for the technical handshake from their side.


10. Begin work, test, ramp and then go full throttle

Work begins. If volumes are large, there is a ramp-up generally provided for in the Project Plan. Starting slowly, the work gradually ramps up to handle the agreed volumes, clearing milestones and meeting quality benchmarks, till it reaches the envisaged end state, at which point the project team pulls back and lets the operational teams on the two sides take over the engagement.
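A ramp-up of this kind can be written down as a simple schedule with a quality gate at each step. The sketch below assumes a linear weekly ramp and a 0.98 accuracy benchmark; both are illustrative choices, and the actual ramp shape and benchmarks would come from your own contract and Project Plan.

```python
from typing import List


def ramp_schedule(target_volume: int, ramp_weeks: int) -> List[int]:
    """Planned weekly volumes, rising in equal steps to the agreed target."""
    if ramp_weeks < 1:
        raise ValueError("ramp_weeks must be at least 1")
    return [target_volume * week // ramp_weeks for week in range(1, ramp_weeks + 1)]


def next_volume(current: int, planned: int, accuracy: float, benchmark: float = 0.98) -> int:
    """Move to the planned volume only if the quality benchmark was met; otherwise hold."""
    return planned if accuracy >= benchmark else current


# Invented example: ramp to 10,000 items a week over four weeks.
plan = ramp_schedule(target_volume=10_000, ramp_weeks=4)
print(plan)

# Week 1 ran at 2,500 items. Whether week 2 steps up depends on measured accuracy.
print(next_volume(current=plan[0], planned=plan[1], accuracy=0.99))
print(next_volume(current=plan[0], planned=plan[1], accuracy=0.95))
```

Holding volume steady when a benchmark is missed gives both teams time to fix the cause before errors multiply at higher volumes.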

Note: This is an indicative sequence of steps designed to enable you in achieving your goals when you seek to outsource polygon annotation. Not all steps will be needed in all cases. There could be additional steps in some other cases, like a Trial Run. Some projects may not need any technology integration. The exact sequence should be worked out based on your requirements.

10 Steps to Outsource Data Labelling



What is data labelling?

Let us illustrate with an example.

A Special Intelligence Unit (SIU) of the armed forces of a particular nation has been given responsibility for handling counter-insurgency operations. The unit keeps tabs on various outfits suspected of carrying out insurgencies, and tracks their communications and movements to get advance warning about their intentions.

It started with monitoring their messages on mobile networks. As both sides became more sophisticated, operations expanded to cover monitoring voice communication, including live calls, their geographical position and propaganda videos circulated by the outfits. While the unit has been effective in controlling insurgency, the enhanced coverage has resulted in an expansion of the force consuming more and more public money, putting the government in a tight spot financially.

A leading software company has assured the government that they will develop an Artificial Intelligence (AI) engine that will take over the task of monitoring, substantially reducing the need for manpower and cost. They have requested the SIU for historical data based on which they have been reaching conclusions regarding the insurgents and their activities.

After satisfying themselves regarding the security of the data, the force shared files with them containing:

  • Text messages intercepted

  • Phone calls intercepted and recorded

  • Geographical locations and change in them

  • Promotional videos

The software company looked at a sample of the content and returned it saying it was of no use to them as it was raw data. While humans, with their intelligence, could interpret that data and make sense out of it, a machine cannot.

Members of the SIU were called upon to identify the elements in each piece of data on which they based their theories and, in turn, their actions. For example, in the text messages, they identified the words, or phrases, that held clues. Similarly, in the phone call recordings, they identified the elements that held clues. The same was done with the other sets of data.

Once these elements were identified and linkages established, the data was again handed over to the company for ingestion by the AI engine. This time it was data that the machine understood and could draw inferences from.

What was done by the special forces was ‘data labelling.’ As more and more data was acquired, they also trained the staff of the provider so that they could identify and label the relevant pieces of data themselves.

And the AI engine that could track insurgencies was born. Enabled and trained through ‘data labelling.’
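For text, the difference between raw and labelled data can be shown in a few lines of Python. The sketch below marks known phrases in a message and records where each one occurs, so that a machine receives both the text and the human judgement about it. The message, the phrase list and the label names are invented for illustration, and real labelling is usually done by people in an annotation tool rather than by a lookup like this.

```python
import re
from typing import Dict, List


def label_phrases(text: str, phrases: Dict[str, str]) -> Dict[str, object]:
    """Return the text together with a list of labelled spans found in it."""
    spans: List[Dict[str, object]] = []
    for phrase, label in phrases.items():
        # re.escape treats the phrase as literal text, not as a pattern.
        for match in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE):
            spans.append({
                "start": match.start(),
                "end": match.end(),
                "text": match.group(),
                "label": label,
            })
    spans.sort(key=lambda span: span["start"])
    return {"text": text, "labels": spans}


# Raw data: just a string of characters to a machine.
message = "Meet at the old bridge at dawn"

# Labelled data: the same string plus the elements an expert considers significant.
record = label_phrases(message, {"old bridge": "LOCATION", "dawn": "TIME"})
for span in record["labels"]:
    print(span)
```

The character offsets let a learning system connect each label to the exact part of the message it describes.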

oWorkers provides data labelling services to clients from around the world operating in a variety of industries and enables them to create smarter Artificial Intelligence engines. This could include placing electronic markings like bounding boxes on image files, putting marks on significant areas on faces, tagging pictures with keywords, and many others.


Why outsource data labelling?

Though we have come a long way, there are still occasions when an outsourcing decision, a logical, properly evaluated one that will add value to the business, needs to be justified, only because it is an ‘outsourcing’ decision.

Let us first answer the question, “why outsource?”

How is an outsourcing decision taken? It is not a given. It is a choice. Right? Through an evaluation process which analyses the different variables in the equation and tries to reach a decision likely to be the most beneficial, generally expressed in financial terms.

But these are standard business rules. How is this any different from any normal business decision?


Outsourcing is like any other engagement between two parties. These parties could be individuals, businesses, governments, or any competent body permitted to take decisions.

Many people send their children to school. Is that not a form of outsourcing? We are outsourcing our children’s education to an organization known as a school instead of doing it ourselves.

We buy food from stores. Is that not outsourcing our food supplies to others instead of producing it ourselves?

An outsourcing contract will materialise only if there is interest and mutual benefit of the two parties to the contract. Exactly like any other contract between two businesses or human beings.

With this background on outsourcing, we can move forward to answer the question, “why outsource data labelling?”

Data labelling can be one of the more monotonous jobs anywhere. Looking at similar sets of data again and again, to identify the same elements. And it has to be done by humans.

Every organisation employs skill sets that are core to their work. A restaurant employs people to cook and to serve. An insurance company employs people like actuaries to evaluate risk on events that enable them to offer insurance policies. Besides, the employed people have perhaps been interested in that profession and undergone education and training to become suitable for employment.  

Data labelling services call for staff who are experts in data labelling, a skillset quite distinct from the skillsets employed by the business that has this requirement. Moreover, since it is a repetitive task for which training is mostly provided on-the-job, the services of these experts are reasonably priced.

It is highly unlikely the business would like to deploy their cooks and actuaries to label the data that needs to be fed to their under-development AI engine. They will probably make a hash of it, while at the same time ignoring their core responsibility on which rests the success of their employer. The business is likely to prefer an outsourced service with cheaper labour and perhaps with tools and applications meant to facilitate the task while exercising process control.

This, then, in simple terms, is the case for outsourcing Data Labelling Services.

Of course, at the start of any exercise, experts in the particular field, cooks or actuaries, may be required to ‘show the way’ and train the resources who are going to be doing the major part of the exercise.

To summarise, the following are the main reasons why many organisations find it preferable to outsource data labelling:

  • Enables them to focus on core activities of the business

  • Allows specialised attention to be given to the task by specialists, reducing errors and enhancing efficiency of the process

It is even more preferable to leverage the data labelling solutions offered by oWorkers, across industry segments like Autonomous Vehicles, Medical AI, Satellite & Aerial imagery, Sports, Retail, Augmented Reality, Insurance, CCTV & Security, Robotics, Agriculture and several others.



The 10 steps to success when you outsource data labelling

Selecting the right vendor is often the focus of effort for most outsourcers, but there is a lot more that needs to be done, both before and after the vendor is identified. Without adequate attention to each of these steps, the end result could be sub-optimal. As an outsourcer, you wouldn’t want that to happen.

Outlined here is a recommended process that will guide you from the point where you have started thinking that you would like to outsource data labelling, through to the end point where the vendor has started delivering.


1. Identify Requirement

Many business deals do not succeed because the client (buyer) is not clear on the requirements. What are they buying? Why are they buying? What problem will it solve? What value will it add? Identification of your need is a good starting point to outsource data labelling. Additionally, some clarity on success (or failure) criteria will be even better. A recommended checklist of items that this phase should cover:

  • Objective of outsourcing

  • Type/s of data

  • File formats

  • Annotations required

  • Defining rules

  • Domain knowledge requirement

  • Timelines

  • Volume


2. Advertise requirement/ Seek participation

Once it is clear that you would like to consider outsourcing as an option, you would need to make it known to the prospective vendor community, with information from relevant items of the checklist created in the earlier phase. This can be done in many different ways:

  • Issuing a Request for Proposal (RFP) is the process preferred by large enterprises. An RFP is a standardised proposal form where you specify the information you seek.

  • Reaching out directly to vendors who may be providing similar services to competitors.

  • Advertising in trade circles, if you are a part of them.

  • Looking up prospective vendors online or through directories like Yellow Pages and informing them one by one of your requirement. They will respond if interested.


3. Shortlist vendors

By now you would have hopefully received some interest from prospective vendors. If you have been flooded with responses, at this stage, based on information received, you should shortlist down to a few, perhaps two or three, with whom you can engage in a more detailed manner. A B2B engagement is a time-consuming affair. The larger the shortlist the more of your time it will require. If responses are inadequate or absent, you may need to review the terms and conditions you have set out. Perhaps they are too strict for vendors to be interested in.


4. Detailed discussion and evaluation

This is where the detailed discussions will happen and both parties will be required to share information. In most B2B cases, this will be preceded by the execution of a Non Disclosure Agreement (NDA) which binds both parties to treating the information received as confidential and enjoins them to ensure it is handled with the utmost care. This is the stage where the potential vendor will make a case for being selected, scope of services will be discussed, including indicative pricing. This is one of the key phases of the process.

A recommended list of capabilities and attributes to look for in a partner:

Prior experience

Facilitates a quick start instead of a slow ramp-up.

Over the last 8 years, oWorkers has successfully executed several data annotation projects for global clients. oWorkers is proficient in a wide variety of annotation types, such as bounding boxes, keypoints, polygons, 3D cuboids, LiDAR segmentation, text categorization, linked entity recognition, grammatical and discourse analysis, review and sentiment analysis, moving bounding boxes and object tracking (SOT, MOT).

Quality and accuracy

Consistently providing superior quality and accuracy is the best way to advertise your capability. Testimonials from existing clients are also helpful.

With QA (Quality Assurance) and QC (Quality Control) processes that represent the client and aim to detect and resolve errors, oWorkers consistently delivers over 98% accuracy. It leverages best-in-class technologies in this effort.

Speed and turnaround

In a fast-paced world, speed is of the essence. The sooner the AI engine can ingest requisite data, the faster it can be brought to market.

In most cases, oWorkers will turn around today’s work even before you begin work the next morning.


Pricing

In any commercial engagement, this is a given. One party delivers goods or services, and the other pays a price for them in money terms. In a B2B engagement, the basket of services and products provided is unique to the engagement, as is the price for it. While a low price is desirable, the outsourcer needs to ensure that the pricing terms offered will add value to their business, instead of simply opting for the lowest number.

Our existing clients save up to 80% when they use our data labelling solutions. The same opportunity is available to all. Our pricing is transparent, with a choice of per hour or per output unit pricing.
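When a vendor offers a choice between per hour and per output unit pricing, a quick calculation can show which model suits a given workload. The following is only a rough sketch in Python: the rates, volume and throughput used in the example are hypothetical figures chosen for illustration, not actual prices from any vendor.

```python
def cost_per_hour_model(hourly_rate, units, units_per_hour):
    """Total cost when paying by the hour, given an assumed throughput."""
    hours_needed = units / units_per_hour
    return hourly_rate * hours_needed


def cost_per_unit_model(unit_price, units):
    """Total cost when paying a fixed price per output unit."""
    return unit_price * units


def break_even_throughput(hourly_rate, unit_price):
    """Units per hour at which both pricing models cost the same.

    Above this throughput, per hour pricing is cheaper;
    below it, per unit pricing is cheaper.
    """
    return hourly_rate / unit_price


if __name__ == "__main__":
    # Hypothetical inputs: 1,000 items, 50 items labelled per hour.
    print(cost_per_hour_model(hourly_rate=10.0, units=1000, units_per_hour=50))
    print(cost_per_unit_model(unit_price=0.25, units=1000))
    print(break_even_throughput(hourly_rate=10.0, unit_price=0.25))
```

The throughput figure is the uncertain input here, which is one reason a Trial Run can be useful before committing to a pricing model.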

Multilingual

This is an important consideration for global businesses, as well as for those with global aspirations. A multilingual partner spares them from having to take on a new vendor for every new language.

Supporting over 22 languages across three global centers, oWorkers is uniquely positioned when you look for partners to outsource data labelling.

Technology and Data Security

Technology, along with data security, forms the backbone of every business. Data labelling services are themselves an input to Artificial Intelligence, a developing technology. Harnessing the right technologies is an important consideration for success.

ISO (27001 and 9001) certified, oWorkers leverages the best technology for data labelling solutions through its partnership with leading providers.

Scalability and access to human resources

Since BPO processing is a human activity, suitable resources, at the right price and in adequate numbers, should be available to the partner to enable them to carry out this task. This also enables the business to scale up and down as per requirement.

oWorkers possesses the flexibility to ramp up and ramp down by a hundred headcount in 48 hours, particularly for Computer Vision projects.

Financial health and management support

Though not directly relevant for operational delivery, these factors are important for ensuring consistency. Financial headwinds for the partner could lead to cutting corners in all projects.

oWorkers has been a consistently profitable enterprise. It pays local and social taxes for its employees and is deeply rooted in the communities it operates in.
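One simple way to compare shortlisted vendors against the attributes listed above is a weighted scoring sheet. The sketch below, in Python, is purely illustrative: the weights, the 1 to 5 rating scale and the vendor names are all hypothetical assumptions, and each outsourcer would set them according to its own priorities.

```python
# Hypothetical weights for the attributes discussed above; they sum to 1.0.
WEIGHTS = {
    "prior_experience": 0.20,
    "quality_accuracy": 0.20,
    "speed": 0.10,
    "pricing": 0.15,
    "multilingual": 0.05,
    "technology_security": 0.15,
    "scalability": 0.10,
    "financial_health": 0.05,
}


def weighted_score(ratings):
    """Weighted average of ratings (assumed 1-5 scale) for one vendor."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    return sum(WEIGHTS[attr] * ratings[attr] for attr in WEIGHTS)


def rank_vendors(vendors):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, round(weighted_score(r), 2)) for name, r in vendors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical vendors and ratings, for illustration only.
    vendors = {
        "Vendor A": {attr: 4 for attr in WEIGHTS},
        "Vendor B": {attr: 3 for attr in WEIGHTS},
    }
    for name, score in rank_vendors(vendors):
        print(name, score)
```

A score like this is only an aid to discussion. Site visits, references and contractual terms still need to be weighed separately.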


5. Shortlist down to 1

Eventually you will decide on the most suitable partner and issue a Letter of Intent. This will be after detailed discussions, site visits, interactions with staff members, exploring combinations, and negotiating on price and service levels. Contractual terms, which would already have been discussed during the previous stage, will now be formalised. The other shortlisted vendors will be retained as backups in case this partnership falls through for some reason.


6. Finalise terms and sign contract

The formal agreement is executed based on mutually acceptable terms and conditions.


7. Trial Run

Initiate a Trial Run if agreed upon. The contract terms should set out the success criteria for this step and the logical next steps for each of the possible outcomes.


8. Implementation Project Plan

If this has not been done at the contracting stage, the parties will develop and agree on an Implementation Plan which defines the steps each of them must take to reach a steady state, in other words, the point at which the activities envisaged in the contract are running at the expected level. The timelines are also defined in the Implementation Plan.


9. Team identification, training and ramp support

The vendor will need to identify the team that will support activities under this contract while the outsourcer will arrange the initial training. Vendor support teams will start playing their regular roles for this project as well. Technical handshakes required will also be made.


10. Begin work, test and then go full steam

Work begins. If volumes are large, a ramp-up is generally provided for in the Project Plan. Starting slowly, the work gradually ramps up to handle the agreed volumes.

Note: This is an indicative sequence of steps and not mandatory. Not all steps will be needed in all cases. In some cases, the sequence could also change; for example, a Trial Run could happen before a contract is signed.



oWorkers, as a pure-play data entry BPO company, has a unique position in the industry, and has been ranked in the Top 3 globally for data entry services. The solutions it offers are scalable, offering flexibility to ramp by 100 resources in 48 hours, a demonstration of its deep connection with the communities it works in. Its resources are employees, not contractors.

Our carefully selected and trained staff are equipped to handle data labelling services for a variety of uses like Sentiment Analysis, Named Entity Recognition, Geo Labelling, etc. With support for 22+ languages, ISO 27001:2013 and 9001:2015 certifications, 24×7 operations and centers in Europe and Africa, oWorkers should be your first choice to outsource data labelling.