Données étiquetées

Partager
" Retour à l'index des glossaires

Labeled data is a crucial element in the field of intelligence artificielle[1] et machine learning[2]. It refers to data that has been tagged or classified with certain labels to provide a context or meaning for that data. For example, an image of a dog could be labeled as “dog” in an image recognition dataset. This labeling process can be done manually, as was done in the early days by a team of undergraduates and Amazon Mechanical Turk workers who labeled millions of images for ImageNet. Alternatively, it can be automated through machine learning models that predict likely labels for new, unlabeled data. This process significantly enhances the efficiency of data analysis and allows for continuous learning and adaptation of models. However, it’s important to note that the quality of labeled data can influence algorithmic decision-making, potentially leading to biases if not statistically representative.

Définitions des termes
1. intelligence artificielle.
1 Artificial Intelligence (AI) refers to the field of computer science that aims to create systems capable of performing tasks that would normally require human intelligence. These tasks include reasoning, learning, planning, perception, and language understanding. AI draws from different fields including psychology, linguistics, philosophy, and neuroscience. The field is prominent in developing machine learning models and natural language processing systems. It also plays a significant role in creating virtual assistants and affective computing systems. AI applications extend across various sectors including healthcare, industry, government, and education. Despite its benefits, AI also raises ethical and societal concerns, necessitating regulatory policies. AI continues to evolve with advanced techniques such as deep learning and generative AI, offering new possibilities in various industries.
2 Artificial Intelligence, commonly known as AI, is a field of computer science dedicated to creating intelligent machines that perform tasks typically requiring human intellect. These tasks include problem-solving, recognizing speech, understanding natural language, and making decisions. AI is categorised into two types: narrow AI, which is designed to perform a specific task, like voice recognition, and general AI, which can perform any intellectual tasks a human being can do. It's a continuously evolving technology that draws from various fields including computer science, mathematics, psychology, linguistics, and neuroscience. The core concepts of AI include reasoning, knowledge representation, planning, natural language processing, and perception. AI has wide-ranging applications across numerous sectors, from healthcare and gaming to military and creativity, and its ethical considerations and challenges are pivotal to its development and implementation.
2. machine learning. Machine learning, a term coined by Arthur Samuel in 1959, is a field of study that originated from the pursuit of artificial intelligence. It employs techniques that allow computers to improve their performance over time through experience. This learning process often mimics the human cognitive process. Machine learning applies to various areas such as natural language processing, computer vision, and speech recognition. It also finds use in practical sectors like agriculture, medicine, and business for predictive analytics. Theoretical frameworks such as the Probably Approximately Correct learning and concepts like data mining and mathematical optimization form the foundation of machine learning. Specialized techniques include supervised and unsupervised learning, reinforcement learning, and dimensionality reduction, among others.

Données étiquetées is a group of samples that have been tagged with one or more labels. Labeling typically takes a set of unlabeled data and augments each piece of it with informative tags. For example, a data label might indicate whether a photo contains a horse or a cow, which words were uttered in an audio recording, what type of action is being performed in a video, what the topic of a news article is, what the overall sentiment of a tweet is, or whether a dot in an X-ray is a tumor.

Labels can be obtained by asking humans to make judgments about a given piece of unlabeled data. Labeled data is significantly more expensive to obtain than the raw unlabeled data.

" Retour à l'index des glossaires
fr_FRFR
Retour en haut