Files
git-mirror/Documentation/i18n.adoc
brian m. carlson 1f010d6bdf doc: use .adoc extension for AsciiDoc files
We presently use the ".txt" extension for our AsciiDoc files.  While not
wrong, most editors do not associate this extension with AsciiDoc,
meaning that contributors don't get automatic editor functionality that
could be useful, such as syntax highlighting and prose linting.

It is much more common to use the ".adoc" extension for AsciiDoc files,
since this helps editors automatically detect files and also allows
various forges to provide rich (HTML-like) rendering.  Let's do that
here, renaming all of the files and updating the includes where
relevant.  Adjust the various build scripts and makefiles to use the new
extension as well.

Note that this should not result in any user-visible changes to the
documentation.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:56:06 -08:00

71 lines
2.9 KiB
Plaintext

Git is to some extent character encoding agnostic.
- The contents of the blob objects are uninterpreted sequences
of bytes. There is no encoding translation at the core
level.
- Path names are encoded in UTF-8 normalization form C. This
applies to tree objects, the index file, ref names, as well as
path names in command line arguments, environment variables
and config files (`.git/config` (see linkgit:git-config[1]),
linkgit:gitignore[5], linkgit:gitattributes[5] and
linkgit:gitmodules[5]).
+
Note that Git at the core level treats path names simply as
sequences of non-NUL bytes, there are no path name encoding
conversions (except on Mac and Windows). Therefore, using
non-ASCII path names will mostly work even on platforms and file
systems that use legacy extended ASCII encodings. However,
repositories created on such systems will not work properly on
UTF-8-based systems (e.g. Linux, Mac, Windows) and vice versa.
Additionally, many Git-based tools simply assume path names to
be UTF-8 and will fail to display other encodings correctly.
- Commit log messages are typically encoded in UTF-8, but other
extended ASCII encodings are also supported. This includes
ISO-8859-x, CP125x and many others, but _not_ UTF-16/32,
EBCDIC and CJK multi-byte encodings (GBK, Shift-JIS, Big5,
EUC-x, CP9xx etc.).
Although we encourage that the commit log messages are encoded
in UTF-8, both the core and Git Porcelain are designed not to
force UTF-8 on projects. If all participants of a particular
project find it more convenient to use legacy encodings, Git
does not forbid it. However, there are a few things to keep in
mind.
. 'git commit' and 'git commit-tree' issue
a warning if the commit log message given to it does not look
like a valid UTF-8 string, unless you explicitly say your
project uses a legacy encoding. The way to say this is to
have `i18n.commitEncoding` in `.git/config` file, like this:
+
------------
[i18n]
commitEncoding = ISO-8859-1
------------
+
Commit objects created with the above setting record the value
of `i18n.commitEncoding` in their `encoding` header. This is to
help other people who look at them later. Lack of this header
implies that the commit log message is encoded in UTF-8.
. 'git log', 'git show', 'git blame' and friends look at the
`encoding` header of a commit object, and try to re-code the
log message into UTF-8 unless otherwise specified. You can
specify the desired output encoding with
`i18n.logOutputEncoding` in `.git/config` file, like this:
+
------------
[i18n]
logOutputEncoding = ISO-8859-1
------------
+
If you do not have this configuration variable, the value of
`i18n.commitEncoding` is used instead.
Note that we deliberately chose not to re-code the commit log
message when a commit is made to force UTF-8 at the commit
object level, because re-coding to UTF-8 is not necessarily a
reversible operation.