Content Quality Guidelines
The objective of the Kerala Online Data Repository is to provide high-quality data created by the various departments of Kerala State.
These guidelines ensure that all datasets published on the platform are:
- Accurate
- Reliable
- Complete
- Consistent
- Timely
- Reusable
To maintain these standards, a stringent metadata vetting and data curation process will be performed for each dataset before publication.
These guidelines apply to:
- Government Departments
- Department Administrators
- Dataset Creators
- Platform Reviewers
1. Data Accuracy
All datasets must:
- Be factually correct and verified by the concerned department.
- Reflect official records or authenticated sources.
- Avoid typographical errors and inconsistent values.
- Undergo validation checks before upload.
Recommended Practices
- Validate numeric and date fields.
- Ensure all dates follow a single standard format.
- Review column names for spelling and clarity.
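The numeric and date checks recommended above could be automated before upload. The following is a minimal sketch, not a platform tool: the function name validate_row, the field names, and the YYYY-MM-DD date format are assumptions made for illustration, since the guidelines do not prescribe a specific format.

```python
from datetime import datetime


def validate_row(row, numeric_fields, date_fields, date_format="%Y-%m-%d"):
    """Return a list of problems found in one record (empty list if none)."""
    problems = []
    # Every listed numeric field must parse as a number.
    for field in numeric_fields:
        value = row.get(field, "")
        try:
            float(value)
        except (TypeError, ValueError):
            problems.append(f"{field}: not numeric ({value!r})")
    # Every listed date field must match the assumed date format.
    for field in date_fields:
        value = row.get(field, "")
        try:
            datetime.strptime(value, date_format)
        except (TypeError, ValueError):
            problems.append(f"{field}: not a {date_format} date ({value!r})")
    return problems
```

A record with "1,200" in a numeric column or "31/01/2024" in a date column would be reported, so the department can correct it before submission.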
2. Data Completeness
Datasets must:
- Contain all mandatory fields.
- Avoid excessive null values.
- Clearly indicate missing data using standardized terms (e.g., NA, Not Available, null).
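One way to judge whether a column has excessive nulls is to measure the share of missing values in it. The sketch below treats the standardized terms listed above, plus empty cells, as missing. The function names are hypothetical, and what counts as "excessive" is left to the reviewer because these guidelines do not set a threshold.

```python
# Markers treated as missing, compared case-insensitively (empty cells included).
MISSING_MARKERS = {"", "na", "not available", "null"}


def is_missing(value):
    """True if a cell is empty or holds one of the standardized missing terms."""
    return value is None or str(value).strip().lower() in MISSING_MARKERS


def missing_share(rows, column):
    """Fraction of rows (0.0 to 1.0) whose value in `column` is missing."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if is_missing(row.get(column)))
    return missing / len(rows)
```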
Mandatory Metadata Fields
- Dataset Name
- Dataset Description
- Department / Organization
- Sector
- Subsector
- Geography / Jurisdiction
- Data Granularity
- Data Owner
- Maintainer Email
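A check for the mandatory metadata fields above might look like the following sketch. The dictionary keys (for example dataset_name) are hypothetical identifiers chosen for illustration; the platform's actual field names may differ.

```python
# Assumed keys, one per mandatory metadata field listed above.
MANDATORY_FIELDS = [
    "dataset_name",
    "dataset_description",
    "department",
    "sector",
    "subsector",
    "geography",
    "data_granularity",
    "data_owner",
    "maintainer_email",
]


def missing_metadata(metadata):
    """Return the mandatory fields that are absent or blank in `metadata`."""
    missing = []
    for field in MANDATORY_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing
```

An empty result means every mandatory field is filled in; any names returned can be reported back to the dataset creator.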
3. Data Consistency
- Use uniform date formats.
- Maintain consistent naming conventions.
- Avoid mixing measurement units in the same column.
- Ensure standardized column headers.
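Column headers can be brought to one naming convention automatically. The sketch below assumes a lower-case, underscore-separated convention, which is an illustrative choice and not one mandated by these guidelines. It handles only ASCII letters and digits, so headers in other scripts would need separate treatment.

```python
import re
from collections import Counter


def normalize_header(name):
    """Lower-case a header and join its words with underscores."""
    words = re.findall(r"[A-Za-z0-9]+", name)
    return "_".join(word.lower() for word in words)


def header_collisions(headers):
    """Return normalized names that more than one original header maps to."""
    counts = Counter(normalize_header(header) for header in headers)
    return sorted(name for name, count in counts.items() if count > 1)
```

For example, "District Name" and "district-name" both normalize to district_name, so header_collisions would flag them as duplicates of the same column.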
4. Timeliness
Each dataset must clearly mention:
- Data Reference Period
- Data Published Date
- Data Release Date
- Data Frequency
Recommended Update Frequencies
- Annual
- Half-Yearly
- Quarterly
- Bimonthly
- Monthly
- Weekly
- Daily
Outdated datasets should be reviewed and updated periodically.
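The periodic review of outdated datasets could be supported by comparing a dataset's last update with its declared frequency. The following is a sketch under stated assumptions: the day limits are illustrative, "Bimonthly" is read as once every two months, and the function name is hypothetical.

```python
from datetime import date

# Assumed maximum age in days for each update frequency (illustrative values).
MAX_AGE_DAYS = {
    "Annual": 365,
    "Half-Yearly": 183,
    "Quarterly": 92,
    "Bimonthly": 61,  # assumed to mean once every two months
    "Monthly": 31,
    "Weekly": 7,
    "Daily": 1,
}


def is_outdated(last_updated, frequency, today):
    """True if the dataset is older than its update frequency allows."""
    age_days = (today - last_updated).days
    return age_days > MAX_AGE_DAYS[frequency]
```

A monthly dataset last updated on 1 January 2024 would be flagged on 1 March 2024, while an annual dataset with the same update date would not.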
5. Metadata Standards
Each dataset must include detailed metadata to improve discoverability and usability.
Required Metadata Elements
- Keywords / Tags
- License Information
- Data Format (CSV, XLSX, JSON, API)
- Geographic and Temporal Coverage
Metadata should comply with the platform's defined metadata standards.
6. Data Privacy and Security
- Personally Identifiable Information (PII) must not be published unless legally permitted.
- Sensitive information must be anonymized.
- Confidential or internal-use-only data must not be uploaded.
- All datasets must comply with applicable data protection regulations.
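A simple pattern scan can help reviewers spot obvious PII before publication. The sketch below looks only for email addresses and 10-digit numbers that may be phone numbers. Both patterns are assumptions chosen for illustration, they will miss many kinds of PII, and a match is only a prompt for manual review, not proof of a violation.

```python
import re

# Illustrative patterns only; real PII detection needs a much broader review.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"(?<!\d)\d{10}(?!\d)"),  # exactly ten digits in a row
}


def find_pii(rows):
    """Return (row_index, column, kind) for every cell matching a pattern."""
    hits = []
    for index, row in enumerate(rows):
        for column, value in row.items():
            for kind, pattern in PII_PATTERNS.items():
                if pattern.search(str(value)):
                    hits.append((index, column, kind))
    return hits
```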
7. File Format Standards
Allowed Formats
- CSV (Recommended for structured tabular data)
- JSON (For APIs and structured machine-readable data)
- XLSX (For spreadsheet-based data)
Not Allowed
- Scanned PDF files for structured data
- Password-protected files
- Proprietary or non-standard formats
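The allowed formats above can be checked from the file extension at upload time. This is a minimal sketch with a hypothetical function name. It inspects the extension only, so it does not detect password protection or confirm that the file contents match the extension.

```python
from pathlib import Path

# Extensions for the allowed formats listed above.
ALLOWED_EXTENSIONS = {".csv", ".json", ".xlsx"}


def is_allowed_format(filename):
    """True if the file extension is one of the allowed formats."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS
```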
8. Review and Approval Process
Before publication, the following process must be followed:
- Dataset preparation by Department Data Creator
- Internal departmental verification
- Metadata validation
- Platform-level quality review
- Approval by Administrator
Only approved datasets will be published on the platform.
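The review steps above form a fixed sequence, which could be modelled as an ordered list of stages. In the sketch below the stage names and functions are hypothetical labels for the five steps, not identifiers used by the platform.

```python
# Assumed stage names, in the order the process above requires.
REVIEW_STAGES = [
    "prepared",             # Dataset preparation by Department Data Creator
    "department_verified",  # Internal departmental verification
    "metadata_validated",   # Metadata validation
    "quality_reviewed",     # Platform-level quality review
    "approved",             # Approval by Administrator
]


def advance(current_stage):
    """Return the stage that follows `current_stage`; no stage can be skipped."""
    index = REVIEW_STAGES.index(current_stage)
    if index == len(REVIEW_STAGES) - 1:
        raise ValueError("dataset is already approved")
    return REVIEW_STAGES[index + 1]


def can_publish(stage):
    """Only datasets that reached the final stage may be published."""
    return stage == "approved"
```

Because advance moves exactly one step at a time, a dataset cannot reach approval without passing through verification, metadata validation, and quality review.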
9. Roles and Responsibilities
Department Data Creator
- Ensure accuracy and completeness.
- Validate dataset before submission.
Administrator
- Review and approve datasets.
- Monitor compliance with quality standards.
- Enforce metadata standards.
- Maintain platform integrity.
- Oversee publication workflow.
10. Compliance
Failure to comply with these guidelines may result in:
- Rejection of dataset submission
- Request for revision
- Temporary suspension of publishing privileges
All departments and contributors are expected to adhere strictly to these standards.