Content Quality Guidelines

The objective of the Kerala Online Data Repository is to provide high-quality data created by the various departments of Kerala State.

These guidelines ensure that all datasets published on the platform are:

  • Accurate
  • Reliable
  • Complete
  • Consistent
  • Timely
  • Reusable

To maintain these standards, a stringent metadata vetting and data curation process will be performed for each dataset before publication.

These guidelines apply to:

  • Government Departments
  • Department Administrators
  • Dataset Creators
  • Platform Reviewers

1. Data Accuracy

All datasets must:

  • Be factually correct and verified by the concerned department.
  • Reflect official records or authenticated sources.
  • Avoid typographical errors and inconsistent values.
  • Undergo validation checks before upload.
  • Have all numeric and date fields validated.
  • Use a standard date format.
  • Have column names reviewed for spelling and clarity.
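
The numeric and date checks above can be sketched as a small pre-upload script. This is a minimal illustration, not the platform's validator: the function name find_invalid_cells, the column names, and the sample rows are hypothetical, and ISO 8601 (YYYY-MM-DD) is assumed as the standard date format because these guidelines do not name one.

```python
import csv
import io
from datetime import datetime

# Assumption: YYYY-MM-DD is the standard date format; the guidelines do not name one.
DATE_FORMAT = "%Y-%m-%d"


def find_invalid_cells(csv_text, numeric_columns, date_columns):
    """Return (line_number, column, value) for every cell that fails its check."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data rows start at line 2 because line 1 holds the column headers.
    for line_number, row in enumerate(reader, start=2):
        for column in numeric_columns:
            value = row[column].strip()
            try:
                float(value)
            except ValueError:
                problems.append((line_number, column, value))
        for column in date_columns:
            value = row[column].strip()
            try:
                datetime.strptime(value, DATE_FORMAT)
            except ValueError:
                problems.append((line_number, column, value))
    return problems


# Hypothetical sample: the second data row has a typo and a non-standard date.
sample = (
    "district,population,survey_date\n"
    "District A,1200,2024-01-15\n"
    "District B,12o0,15/01/2024\n"
)
issues = find_invalid_cells(sample, ["population"], ["survey_date"])
for line_number, column, value in issues:
    print(f"line {line_number}: invalid {column!r} value {value!r}")
```

A department could run a check of this kind on each file before submission and correct the reported cells at the source.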

2. Data Completeness

Datasets must:

  • Contain all mandatory fields.
  • Avoid excessive null values.
  • Clearly indicate missing data using one standardized term, applied consistently (e.g., NA, Not Available, or null).
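
One way to make "excessive" measurable is to report the share of missing cells per column. The sketch below is illustrative only: the function name missing_share and the sample data are hypothetical, the set of missing-value tokens is an assumption based on the examples above, and any acceptance threshold would have to be set by the platform.

```python
import csv
import io

# Assumption: these tokens (compared case-insensitively) mark a missing value.
MISSING_TOKENS = {"", "na", "not available", "null"}


def missing_share(csv_text):
    """Return the fraction of missing cells for each column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    total_rows = 0
    for row in reader:
        total_rows += 1
        for name in counts:
            # Short rows yield None for absent cells; treat those as missing too.
            value = (row[name] or "").strip().lower()
            if value in MISSING_TOKENS:
                counts[name] += 1
    if total_rows == 0:
        return {name: 0.0 for name in counts}
    return {name: count / total_rows for name, count in counts.items()}


# Hypothetical sample with two missing values in the second column.
sample = "ward,households\nW1,120\nW2,NA\nW3,\nW4,95\n"
shares = missing_share(sample)
print(shares)
```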

Mandatory Metadata Fields

  • Dataset Name
  • Dataset Description
  • Department / Organization
  • Sector
  • Subsector
  • Geography / Jurisdiction
  • Data Granularity
  • Data Owner
  • Maintainer Email
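
A completeness check for the mandatory fields listed above could look like the following. The field names are taken from this section; representing metadata as a dictionary and the function name missing_metadata are assumptions made for illustration.

```python
# Field names copied from the "Mandatory Metadata Fields" list.
MANDATORY_FIELDS = [
    "Dataset Name",
    "Dataset Description",
    "Department / Organization",
    "Sector",
    "Subsector",
    "Geography / Jurisdiction",
    "Data Granularity",
    "Data Owner",
    "Maintainer Email",
]


def missing_metadata(metadata):
    """Return the mandatory fields that are absent or blank, in list order."""
    missing = []
    for field in MANDATORY_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical record: one field is absent and one is blank.
record = {
    "Dataset Name": "Sample dataset",
    "Dataset Description": "Illustrative record only",
    "Department / Organization": "Sample Department",
    "Sector": "Sample Sector",
    "Geography / Jurisdiction": "State",
    "Data Granularity": "District",
    "Data Owner": "Sample Owner",
    "Maintainer Email": "   ",
}
print(missing_metadata(record))
```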

3. Data Consistency

  • Use uniform date formats.
  • Maintain consistent naming conventions.
  • Avoid mixing measurement units in the same column.
  • Ensure standardized column headers.
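
Two of the consistency rules above lend themselves to simple automated checks. The sketch below assumes a lower-case, underscore-separated convention for column headers and assumes that units appear as a suffix after the number (for example "12 km"); neither convention is prescribed by these guidelines, and both function names are hypothetical.

```python
import re


def normalize_header(name):
    """Lower-case a header and join its words with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def units_in_column(values):
    """Collect the unit suffixes found in values such as '12 km' or '800 m'."""
    units = set()
    for value in values:
        match = re.fullmatch(r"\s*[-+]?\d+(?:\.\d+)?\s*([A-Za-z%]+)\s*", value)
        if match:
            units.add(match.group(1).lower())
    return units


print(normalize_header(" Total Population (2021) "))
found = units_in_column(["12 km", "3.5 km", "800 m"])
if len(found) > 1:
    print("Mixed units in one column:", sorted(found))
```

A column that yields more than one unit should be converted to a single unit, with the unit stated once in the column header or metadata.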

4. Timeliness

Each dataset must clearly mention:

  • Data Reference Period
  • Data Published Date
  • Data Release Date
  • Data Frequency, stated as one of:
      • Annual
      • Half-Yearly
      • Quarterly
      • Bimonthly
      • Monthly
      • Weekly
      • Daily

Outdated datasets should be reviewed and updated periodically.
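
A periodic review could flag datasets whose last release is older than their declared frequency allows. The following is a rough sketch under stated assumptions: the day limits are illustrative rather than platform policy, "Bimonthly" is read as once every two months, and the function name is_overdue is hypothetical.

```python
from datetime import date, timedelta

# Assumption: illustrative maximum ages in days for each declared frequency.
# "Bimonthly" is interpreted here as once every two months.
MAX_AGE_DAYS = {
    "Annual": 366,
    "Half-Yearly": 184,
    "Quarterly": 92,
    "Bimonthly": 62,
    "Monthly": 31,
    "Weekly": 7,
    "Daily": 1,
}


def is_overdue(frequency, last_release, today):
    """Return True when the last release is older than the frequency allows."""
    return today - last_release > timedelta(days=MAX_AGE_DAYS[frequency])


# A monthly dataset last released two months ago is flagged for review.
print(is_overdue("Monthly", date(2024, 1, 1), date(2024, 3, 1)))
# An annual dataset released five months ago is not.
print(is_overdue("Annual", date(2024, 1, 1), date(2024, 6, 1)))
```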

5. Metadata Standards

Each dataset must include detailed metadata to improve discoverability and usability.

Required Metadata Elements

  • Keywords / Tags
  • License Information
  • Data Format (CSV, XLSX, JSON, API)
  • Geographic and Temporal Coverage

Metadata should comply with the platform's defined metadata standards.

6. Data Privacy and Security

  • Personally Identifiable Information (PII) must not be published unless legally permitted.
  • Sensitive information must be anonymized.
  • Confidential or internal-use-only data must not be uploaded.
  • All datasets must comply with applicable data protection regulations.
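
A basic pattern scan can help reviewers spot obvious PII before publication. This is only a first-pass sketch and no substitute for manual review or legal assessment: the two patterns (email addresses and 10-digit mobile numbers beginning with 6 to 9) are assumptions chosen for illustration, they will miss many kinds of PII and may produce false positives, and the function name scan_for_pii is hypothetical.

```python
import re

# Assumption: illustrative patterns only; real PII detection needs far more.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"(?<!\d)[6-9]\d{9}(?!\d)"),
}


def scan_for_pii(values):
    """Return (index, label) for each value that matches a PII pattern."""
    hits = []
    for index, value in enumerate(values):
        for label, pattern in PII_PATTERNS.items():
            if pattern.search(value):
                hits.append((index, label))
    return hits


# Hypothetical cell values from a dataset column.
cells = ["Ward 12", "contact: officer@example.org", "9876543210"]
print(scan_for_pii(cells))
```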

7. File Format Standards

Allowed Formats

  • CSV (Recommended for structured tabular data)
  • JSON (For APIs and structured machine-readable data)
  • XLSX (For spreadsheet-based data)

Not Allowed

  • Scanned PDF files for structured data
  • Password-protected files
  • Proprietary or non-standard formats
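
The allowed formats above can be enforced at upload time with a simple extension check. The sketch below is a minimal illustration with a hypothetical function name; it inspects only the file name, so it cannot detect password protection or a mislabelled file, which would need a content-level check.

```python
from pathlib import Path

# Extensions corresponding to the "Allowed Formats" list.
ALLOWED_EXTENSIONS = {".csv", ".json", ".xlsx"}


def is_allowed_format(filename):
    """Return True when the file extension is one of the allowed formats."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


print(is_allowed_format("rainfall_2023.CSV"))
print(is_allowed_format("annual_report.pdf"))
```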

8. Review and Approval Process

Before publication, the following process must be followed:

  1. Dataset preparation by Department Data Creator
  2. Internal departmental verification
  3. Metadata validation
  4. Platform-level quality review
  5. Approval by Administrator

Only approved datasets will be published on the platform.
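
The five-step process above can be modelled as an ordered list of stages that must be completed in sequence. This is a sketch of one possible representation, not the platform's workflow engine; the stage labels are shortened from the steps above and the function names are hypothetical.

```python
# Stage labels follow the five steps of the review process, in order.
REVIEW_STAGES = [
    "Dataset preparation",
    "Internal departmental verification",
    "Metadata validation",
    "Platform-level quality review",
    "Approval by Administrator",
]


def next_stage(completed):
    """Return the next required stage, or None when every stage is done."""
    if completed != REVIEW_STAGES[: len(completed)]:
        raise ValueError("stages were skipped or completed out of order")
    if len(completed) == len(REVIEW_STAGES):
        return None
    return REVIEW_STAGES[len(completed)]


def is_publishable(completed):
    """A dataset is publishable only after all five stages, in order."""
    return completed == REVIEW_STAGES


done = ["Dataset preparation", "Internal departmental verification"]
print(next_stage(done))
print(is_publishable(done))
```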

9. Roles and Responsibilities

Department Data Creator

  • Ensure accuracy and completeness.
  • Validate the dataset before submission.

Administrator

  • Review and approve datasets.
  • Monitor compliance with quality standards.
  • Enforce metadata standards.
  • Maintain platform integrity.
  • Oversee publication workflow.

10. Compliance

Failure to comply with these guidelines may result in:

  • Rejection of dataset submission
  • Request for revision
  • Temporary suspension of publishing privileges

All departments and contributors are expected to adhere strictly to these standards.