Internship Project/Data Labeling and Update Tool
(25.02.18) User Scenario
Key Features
- Data Retrieval & Access
- Retrieve data by unique ID or fetch all data.
- Search and filter datasets based on various criteria.
- View the latest update log for specific data upon retrieval.
- Data Modification & Labeling
- Update values of fields.
- Modify entire field values.
- Replace words (or column values) with labels.
- Perform CRUD operations on labels (create, read, update, and delete annotations).
- Logging & Version Control
- Track modifications with detailed logs (who modified, when, and what changed).
- Retrieve previous versions of modified data.
- Revert to an earlier version if necessary.
- Role-Based Access Control
- Log in with assigned roles (reviewer, admin).
- Access only assigned datasets based on user roles.
- Register users with different roles and manage access permissions.
- Data Organization & Assigning
- Sort and group datasets based on various criteria.
- Admin assigns dataset groups (and specific rows) to users.
- Allow multiple users to label and modify data simultaneously.
- Reference Template Uploading
- Upload reference template data (e.g., no_sql_template, sql_template) for easy access and usage.
- Admin can upload new templates or modify existing ones.
- Associate templates with specific datasets for quick reference during labeling and data modification.
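The Logging & Version Control features listed above (modification logs, retrieving previous versions, reverting) could be sketched as follows. This is a minimal in-memory illustration, not the project's design: the names `LogEntry`, `VersionedRecord`, `values_at`, and `revert` are hypothetical, and each log entry is assumed to store a full before/after snapshot of the record.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass(frozen=True)
class LogEntry:
    """One modification: who made it, when, and the values before and after."""
    version: int
    user: str
    timestamp: datetime
    before: Dict[str, Any]
    after: Dict[str, Any]

    def changed_fields(self) -> List[str]:
        # Names of the fields whose value differs between the two snapshots.
        keys = set(self.before) | set(self.after)
        return sorted(k for k in keys if self.before.get(k) != self.after.get(k))


@dataclass
class VersionedRecord:
    """A data row plus its modification history (hypothetical structure)."""
    record_id: str
    values: Dict[str, Any]
    log: List[LogEntry] = field(default_factory=list)

    def _commit(self, user: str, after: Dict[str, Any]) -> LogEntry:
        # Append a log entry and make `after` the current state.
        entry = LogEntry(len(self.log) + 1, user, datetime.now(timezone.utc),
                         dict(self.values), dict(after))
        self.values = dict(after)
        self.log.append(entry)
        return entry

    def update(self, user: str, new_values: Dict[str, Any]) -> LogEntry:
        return self._commit(user, {**self.values, **new_values})

    def values_at(self, version: int) -> Dict[str, Any]:
        # Version 0 is the original data; version n is the state after entry n.
        if version == 0:
            return dict(self.log[0].before) if self.log else dict(self.values)
        return dict(self.log[version - 1].after)

    def revert(self, user: str, version: int) -> LogEntry:
        # A revert is logged as a new modification, so history is never lost.
        return self._commit(user, self.values_at(version))


record = VersionedRecord("sample-1", {"sql": "SELECT * FROM users"})
record.update("reviewer_a", {"sql": "SELECT COUNT(*) FROM users"})
record.revert("admin", 0)          # back to the original value
latest = record.log[-1]            # latest update log for this record
```

Logging a revert as a new entry keeps the history append-only, so the "latest update log" shown on retrieval is always `record.log[-1]`.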
User Scenario
- Personas (by role)
| Role | Description |
|---|---|
| admin | Manages data organization (assigning data to users), user roles, logging, reports, and access control. |
| reviewer (labeler) | Responsible for data labeling, modification, and review, ensuring data accuracy and consistency. |
Admin Scenario
(Grouping & Sorting Datasets, Assigning to Users)
Step 1: Logging In & Selecting Data
- Log in with administrator credentials (ID/PW).
- Retrieve all available datasets and samples.
- Select a dataset.
Step 2: Sorting & Grouping Data
- Sort sample data by status (updated or not updated), Sample ID, SQL Template Type (number), or other custom criteria.
- Select specific samples.
- Create dataset groups from the selected samples, together with dataset metadata (e.g., assigned user, due date, description), for specific annotation tasks.
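The sorting and grouping step above might look like the following sketch. The `Sample` and `DatasetGroup` classes, their field names, and the example values (user name, due date) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Sequence


@dataclass
class Sample:
    sample_id: int
    template_type: int      # SQL template type number
    updated: bool = False   # status: has the sample been updated?


@dataclass
class DatasetGroup:
    """Selected samples plus dataset metadata (hypothetical structure)."""
    name: str
    samples: List[Sample]
    assigned_user: Optional[str] = None
    due_date: Optional[date] = None
    description: str = ""


def sort_samples(samples: Sequence[Sample],
                 by: Sequence[str] = ("updated", "sample_id")) -> List[Sample]:
    # Sort by one or more attributes; False sorts before True,
    # so samples that are not yet updated come first.
    return sorted(samples, key=lambda s: tuple(getattr(s, name) for name in by))


samples = [Sample(3, 2, updated=True), Sample(1, 1), Sample(2, 2)]
ordered = sort_samples(samples)                            # IDs 1, 2, 3
selected = [s for s in ordered if s.template_type == 2]    # IDs 2, 3
group = DatasetGroup(
    name="template-2 review",
    samples=selected,
    assigned_user="reviewer_a",      # illustrative value
    due_date=date(2025, 3, 7),       # illustrative value
    description="Check queries built from SQL template 2",
)
```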
Step 3: Assigning Data to Users
- Select a dataset group.
- Assign it to specific users (reviewers/labelers).
- Confirm assignments.
Step 4: Monitoring & Tracking Assignment Completion
- Select a dataset group to monitor.
- Check the progress status (Updated/Not Updated) of each sample within the dataset.
- Filter or sort the dataset to display the status of samples.
- Confirm the dataset.
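As a rough illustration of the progress check in this step, the Updated/Not Updated status of each sample can be summarized per dataset group. The function name `group_progress` and the input shape (a mapping from sample ID to an "updated" flag) are assumptions.

```python
from typing import Any, Dict


def group_progress(sample_status: Dict[int, bool]) -> Dict[str, Any]:
    """Summarize a dataset group from a {sample_id: updated?} mapping."""
    total = len(sample_status)
    updated = sum(1 for done in sample_status.values() if done)
    return {
        "total": total,
        "updated": updated,
        "not_updated": total - updated,
        # IDs still waiting for a reviewer, for filtering in the admin view.
        "pending_ids": sorted(sid for sid, done in sample_status.items() if not done),
        # The group can be confirmed once every sample has been updated.
        "complete": total > 0 and updated == total,
    }


summary = group_progress({101: True, 102: False, 103: True})
```

Here `summary["pending_ids"]` is `[102]`, so the group is not yet ready to confirm.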
Step 5: Monitoring & Tracking User Assignment Completion
- Select a reviewer/labeler to track their assigned datasets.
- Check the status (Updated/Not Updated) of each dataset group assigned to the user.
- Leave comments for the user, or request a re-label/re-update.
- Confirm the dataset.
Reviewer (Labeler) Scenario
(Reviewing & Updating Assigned Data)
Step 1: Logging In & Accessing Assigned Data
- Log in with assigned credentials (ID/PW).
- Navigate to the user's assigned datasets.
- Retrieve datasets assigned by the admin.
- Search or filter datasets by template number, status, metadata, etc.
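The role-based retrieval in this step could be sketched as below: a reviewer sees only the datasets assigned to them, while an admin sees everything. The `User` and `Dataset` classes, the `accessible_datasets` function, and the status strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class User:
    user_id: str
    role: str           # "admin" or "reviewer"


@dataclass
class Dataset:
    name: str
    assigned_to: str    # user_id of the assigned reviewer
    template_no: int
    status: str         # "Updated" or "Not Updated"


def accessible_datasets(user: User, datasets: Sequence[Dataset],
                        template_no: Optional[int] = None,
                        status: Optional[str] = None) -> List[Dataset]:
    # Role check first: admins see all datasets, reviewers only their own.
    if user.role == "admin":
        visible = list(datasets)
    elif user.role == "reviewer":
        visible = [d for d in datasets if d.assigned_to == user.user_id]
    else:
        raise PermissionError(f"unknown role: {user.role}")
    # Optional filters, applied only when a value is given.
    if template_no is not None:
        visible = [d for d in visible if d.template_no == template_no]
    if status is not None:
        visible = [d for d in visible if d.status == status]
    return visible


datasets = [
    Dataset("group-a", "reviewer_a", 1, "Not Updated"),
    Dataset("group-b", "reviewer_a", 2, "Updated"),
    Dataset("group-c", "reviewer_b", 1, "Not Updated"),
]
mine = accessible_datasets(User("reviewer_a", "reviewer"), datasets,
                           status="Not Updated")
```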
Step 2: Updating, Labeling & Validating Data
- Select a dataset and retrieve its samples.
- Select a specific sample from the dataset.
- View the existing column field values.
- Update the values:
- Full modification (replacing the entire value).
- Labeling (replacing column values with predefined labels).
- Pass/Skip/Confirmed (no modification or update needed).
- Review the modifications and verify query correctness.
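The three update modes in this step (full modification, labeling, pass) can be illustrated with a small sketch. The `Action` enum, the `apply_update` function, and the `{column_1}`-style label format are assumptions; the actual label format is still an open question in the discussion points.

```python
import re
from enum import Enum
from typing import Dict, Optional


class Action(Enum):
    FULL = "full"     # replace the entire value
    LABEL = "label"   # replace words with predefined labels
    PASS = "pass"     # confirmed as-is, no change


def apply_update(value: str, action: Action, *,
                 new_value: Optional[str] = None,
                 labels: Optional[Dict[str, str]] = None) -> str:
    if action is Action.PASS:
        return value
    if action is Action.FULL:
        if new_value is None:
            raise ValueError("FULL requires new_value")
        return new_value
    if not labels:
        raise ValueError("LABEL requires a non-empty labels mapping")
    # Replace whole words only, in a single pass, so that one replacement
    # cannot be re-matched by another; longer words are tried first.
    words = sorted(labels, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    return pattern.sub(lambda m: labels[m.group(1)], value)


query = "SELECT name FROM users WHERE age > 30"
labeled = apply_update(query, Action.LABEL,
                       labels={"name": "{column_1}", "users": "{table_1}"})
```

With these inputs, `labeled` is `SELECT {column_1} FROM {table_1} WHERE age > 30`.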
Step 3: Submitting Updates to Admin
- Submit the updated sample data of the dataset.
- The system logs the modification details (who, when, and what changed).
Discussion Points: Labeling Feature
- Should labels be created, read, updated, and deleted individually?
- Should all labels be uploaded in bulk in advance and then used for labeling tasks?
- Should labels (column names, etc.) be generated from templates to avoid the need for manual uploads?
- How can related labels be easily selected on the front end?