Conventional Commits Specification

From Wikipedia, the free encyclopedia

The Conventional Commits Specification (CCS) is a specification that formalizes how commits in version control systems are categorized. It distinguishes code changes by their purpose, such as features, bug fixes, or documentation updates, to support automated processes like changelog generation and semantic versioning.[1]

Background

Modern distributed software development relies heavily on commit messages to track changes. While early classification frameworks, such as the one proposed by Swanson in 1976, categorized maintenance into three types (perfective, adaptive, and corrective), modern development has shifted toward finer-grained taxonomies.[1]

CCS is a widely adopted standard that requires commit messages to follow a specific format:

<type>[optional scope]: <description>
[optional body]
[optional footer(s)]

The mandatory <type> field assigns the commit to one of ten classes, which makes the history machine-readable.[1] The footer section is also often used to mark breaking changes explicitly with the BREAKING CHANGE token, which developers and automated tools rely on to identify backwards-incompatible updates.[2]
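The format above can be checked mechanically. The following is a minimal sketch rather than a reference implementation: the regular expression, the `ParsedCommit` class, and the `parse_commit` function are hypothetical names introduced for illustration, and the sketch only handles the header and the BREAKING CHANGE footer token described above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# The ten types listed in this article. Treating this set as fixed is an
# assumption of the sketch; individual projects may configure their own list.
CCS_TYPES = {
    "feat", "fix", "perf", "style", "refactor",
    "docs", "test", "ci", "build", "chore",
}

# Header shape: <type>[optional scope]: <description>
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()]+)\))?: (?P<description>\S.*)$"
)


@dataclass
class ParsedCommit:
    type: str
    scope: Optional[str]
    description: str
    breaking: bool


def parse_commit(message: str) -> Optional[ParsedCommit]:
    """Return the parsed header, or None if the message does not conform."""
    lines = message.splitlines()
    if not lines:
        return None
    match = HEADER_RE.match(lines[0])
    if match is None or match.group("type") not in CCS_TYPES:
        return None
    # Any later line that starts with the BREAKING CHANGE token flags the commit.
    breaking = any(line.startswith("BREAKING CHANGE: ") for line in lines[1:])
    return ParsedCommit(
        type=match.group("type"),
        scope=match.group("scope"),
        description=match.group("description"),
        breaking=breaking,
    )
```

Under these assumptions, `parse_commit("fix: handle empty input")` yields a `ParsedCommit` with type `fix` and no scope, while a free-form message such as `"updated stuff"` yields `None`.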

Classification types

Research into CCS usage has identified ten primary categories used to classify commits. To address ambiguity found in earlier definitions (such as those from the Angular project), the following definitions have been proposed to minimize overlap:[3]

  • Feature (feat): Changes that introduce new functionality to the codebase. This includes both user-oriented and developer-oriented features.
  • Fix (fix): Changes that resolve bugs or faults.
  • Performance (perf): Modifications aimed specifically at improving performance (e.g., execution speed, memory usage) without changing behavior.
  • Style (style): Changes that improve code readability without altering meaning (e.g., formatting, indentation, variable naming).
  • Refactor (refactor): Restructuring code to improve maintainability without changing external behavior. This category explicitly excludes changes that strictly fall under "style" or "perf".
  • Documentation (docs): Modifications to documentation or text files (e.g., comments, READMEs).
  • Test (test): Adding or updating test files.
  • Continuous Integration (ci): Changes to CI configuration files and scripts (e.g., GitHub Actions, Travis CI).
  • Build (build): Modifications affecting the build system or external dependencies (e.g., Gradle, Maven).
  • Chore (chore): Miscellaneous tasks that do not fit into the other categories.

These types categorize commits based on two dimensions: purpose (motivation, such as refactoring) and object (what changed, such as tests). In cases of overlap, priority is typically given to the purpose; for example, refactoring a test file is classified as "refactor" rather than "test".[4]
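The priority rule can be sketched as a small function. This is an illustration only: how the ten types divide between the two dimensions is an assumption of this sketch (the text above confirms only that refactoring is a purpose and tests are an object), and `resolve_type` is a hypothetical helper name.

```python
from typing import Optional

# ASSUMPTION: this grouping is illustrative. The article only states that
# "refactor" is a purpose-style type and "test" is an object-style type.
PURPOSE_TYPES = {"feat", "fix", "perf", "style", "refactor"}
OBJECT_TYPES = {"docs", "test", "ci", "build"}


def resolve_type(purpose: Optional[str], changed_object: Optional[str]) -> str:
    """Pick one commit type, giving the purpose priority over the object."""
    if purpose in PURPOSE_TYPES:
        return purpose
    if changed_object in OBJECT_TYPES:
        return changed_object
    # Nothing else applies, so fall back to the residual category.
    return "chore"
```

With this rule, a refactoring that touches only test files resolves to `refactor`, matching the example in the text, while a change to test files with no identifiable purpose resolves to `test`.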

Adoption and usage

Adoption of CCS has increased steadily in the open-source community, though rates vary by ecosystem and methodology. A 2025 study analyzing over 3,000 top GitHub projects found that 116 projects had explicitly declared their adoption of CCS in documentation.[5] Projects typically adopt CCS in one of two modes:

  • Document Declaration: Explicitly stating the convention in contributing guidelines.
  • Integrated Automation: Using tools like commitlint or GitHub Actions to enforce the format.[6]
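As a rough illustration of the second mode, the check below mimics what an enforcement step does, without reproducing commitlint's actual rules or configuration. Git invokes a `commit-msg` hook with the path to the message file as its first argument. The pattern and the `check_message_file` function are assumptions made for this sketch.

```python
import re
import sys

# Simplified header pattern. This is an assumption of the sketch and is far
# less thorough than the rule sets that dedicated tools ship with.
HEADER_RE = re.compile(
    r"^(feat|fix|perf|style|refactor|docs|test|ci|build|chore)"
    r"(\([^()]+\))?: \S"
)


def check_message_file(path: str) -> int:
    """Return 0 if the first line of the message file conforms, else 1."""
    with open(path, encoding="utf-8") as handle:
        first_line = handle.readline().rstrip("\n")
    if HEADER_RE.match(first_line):
        return 0
    print(f"commit header does not follow the format: {first_line!r}",
          file=sys.stderr)
    return 1
```

To use this as a hook, a script saved as `.git/hooks/commit-msg` could end with `sys.exit(check_message_file(sys.argv[1]))`, so that a non-zero exit code makes Git abort the commit.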

The specification appears particularly prevalent in the NPM ecosystem. An analysis of 381 popular NPM projects found that 360 (approximately 94.5%) contained commits conforming to the Conventional Commits standard. Furthermore, 198 of these projects had over 80% of their commits adhering to the specification, demonstrating a high level of compliance in JavaScript and TypeScript development environments.[7] Even in projects that do not officially mandate the specification, approximately 10% of commits submitted by developers in 2023 voluntarily adhered to the format.[6]
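The figures above rest on a per-project conformance rate and a project-level share. A minimal sketch of both calculations follows. How the cited study actually matched commits is not stated here, so the header pattern and the `conformance_rate` function are assumptions for illustration.

```python
import re
from typing import Sequence

# ASSUMPTION: a commit counts as conforming if its subject line has the shape
# <type>[optional scope]: <description>. The study's exact rule may differ.
HEADER_RE = re.compile(r"^[a-z]+(\([^()]+\))?: \S")


def conformance_rate(subjects: Sequence[str]) -> float:
    """Fraction of commit subject lines that match the header pattern."""
    if not subjects:
        return 0.0
    conforming = sum(1 for subject in subjects if HEADER_RE.match(subject))
    return conforming / len(subjects)


# Project-level share reported in the text: 360 of 381 projects.
share = 360 / 381
print(f"{share:.1%}")  # prints 94.5%
```

A project would fall into the "over 80%" group mentioned above when `conformance_rate` applied to its commit subjects returns a value greater than 0.8.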

Challenges

Developers face several challenges when manually classifying commits according to CCS. A qualitative analysis of developer discussions on GitHub and Stack Overflow identified four main issues:[8]

  1. Type Confusion: The most prevalent challenge (approx. 58% of issues), where developers are unsure which type applies. They commonly confuse feat with chore and struggle with the overlapping definitions of refactor, style, and perf.
  2. Type Aliases: Requests to use synonyms, such as "patch" instead of "fix".
  3. Changing Types: Requests to add new types (e.g., "security") or remove existing ones.
  4. Lack of Definitions: Calls for a comprehensive, standardized list of definitions, as the official specification often defers to Angular's guidelines, which some developers find ambiguous.[9]

Automated classification

Given the complexity and granularity of the ten CCS categories, automated classification has been explored to assist developers. While traditional approaches using BERT have been successful for three-category classification (adaptive, corrective, perfective), they struggle with the finer distinctions of CCS.[10]

Recent approaches based on fine-tuned Large Language Models (LLMs), such as CodeLlama, have performed better. A fine-tuned CodeLlama model achieved a macro F1 score of roughly 76%, outperforming both BERT and GPT-4 in classifying commits into the ten CCS types.[11] The chore and refactor categories remain the hardest to classify automatically because of their broad or residual definitions.[11]
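Macro F1 is the unweighted mean of the per-class F1 scores, so a class that is classified poorly lowers the overall score regardless of how many commits it contains. The sketch below computes it from scratch. The `macro_f1` function is written for illustration, and the labels in the usage note are toy data, not results from the cited study.

```python
from collections import Counter
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), defined as 0 when the denominator is 0.
        denominator = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denominator if denominator else 0.0)
    return sum(scores) / len(scores)
```

For the toy inputs `["feat", "feat", "fix", "chore"]` (true) and `["feat", "fix", "fix", "refactor"]` (predicted), feat and fix each score 2/3 while chore and refactor score 0, giving a macro F1 of 1/3. The two failing classes pull the average down even though each covers a single commit.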

References

  1. Zeng et al. 2025, p. 2277.
  2. Kong et al. 2025, p. 111:3.
  3. Zeng et al. 2025, p. 2283.
  4. Zeng et al. 2025, pp. 2283–2284.
  5. Zeng et al. 2025, pp. 2280–2281.
  6. Zeng et al. 2025, p. 2281.
  7. Kong et al. 2025, p. 111:7.
  8. Zeng et al. 2025, p. 2282.
  9. Zeng et al. 2025, pp. 2282–2283.
  10. Zeng et al. 2025, p. 2284.
  11. Zeng et al. 2025, p. 2285.

Sources

  • Zeng, Qunhong; Zhang, Yuxia; Qiu, Zhiqing; Liu, Hui (2025). "A First Look at Conventional Commits Classification". 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE). Ottawa, ON, Canada: IEEE. pp. 2277–2289. doi:10.1109/ICSE55347.2025.00011.
  • Kong, Dezhen; Liu, Jiakun; Bao, Lingfeng; Lo, David (2025). "Toward Better Comprehension of Breaking Changes in the NPM Ecosystem". ACM Transactions on Software Engineering and Methodology. 34 (4). New York, NY, USA: Association for Computing Machinery: 111:1–111:23. doi:10.1145/3702991.