comparison dep/anitomy/README.md @ 9:5c0397762b53

INCOMPLETE: megacommit :)
author Paper <mrpapersonic@gmail.com>
date Sun, 10 Sep 2023 03:59:16 -0400
# Anitomy

*Anitomy* is a C++ library for parsing anime video filenames. It's accurate, fast, and simple to use.

## Examples

The following filename...

    [TaigaSubs]_Toradora!_(2008)_-_01v2_-_Tiger_and_Dragon_[1280x720_H.264_FLAC][1234ABCD].mkv

...is resolved into these elements:

- Release group: *TaigaSubs*
- Anime title: *Toradora!*
- Anime year: *2008*
- Episode number: *01*
- Release version: *2*
- Episode title: *Tiger and Dragon*
- Video resolution: *1280x720*
- Video term: *H.264*
- Audio term: *FLAC*
- File checksum: *1234ABCD*

Here's an example code snippet...

```cpp
#include <iostream>
#include <anitomy/anitomy.h>

int main() {
  anitomy::Anitomy anitomy;
  anitomy.Parse(L"[Ouroboros]_Fullmetal_Alchemist_Brotherhood_-_01.mkv");

  const auto& elements = anitomy.elements();

  // Elements are iterable, where each element is a category-value pair
  for (const auto& element : elements) {
    std::wcout << element.first << '\t' << element.second << '\n';
  }
  std::wcout << '\n';

  // You can access values directly by using get() and get_all() methods
  std::wcout << elements.get(anitomy::kElementAnimeTitle) << L" #" <<
                elements.get(anitomy::kElementEpisodeNumber) << L" by " <<
                elements.get(anitomy::kElementReleaseGroup) << '\n';

  return 0;
}
```

...which will output:

```
12      mkv
13      [Ouroboros]_Fullmetal_Alchemist_Brotherhood_-_01
7       01
2       Fullmetal Alchemist Brotherhood
16      Ouroboros

Fullmetal Alchemist Brotherhood #01 by Ouroboros
```

## How does it work?

Suppose that we're working on the following filename:

    "Spice_and_Wolf_Ep01_[1080p,BluRay,x264]_-_THORA.mkv"

The filename is first stripped of its extension and split into groups. Groups are determined by the position of brackets:

    "Spice_and_Wolf_Ep01_", "1080p,BluRay,x264", "_-_THORA"

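As a rough illustration of this step, here is a minimal sketch. It is not *Anitomy*'s actual tokenizer: `StripExtension` and `SplitIntoGroups` are hypothetical helpers written for this example, and they assume that only square brackets are used and that brackets are never nested.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: removes everything from the last '.' onwards.
// Assumes the filename really ends with an extension.
std::wstring StripExtension(const std::wstring& filename) {
  const auto dot = filename.rfind(L'.');
  return dot == std::wstring::npos ? filename : filename.substr(0, dot);
}

// Hypothetical helper: cuts the name at '[' and ']'.
// The brackets themselves are dropped and empty groups are skipped.
std::vector<std::wstring> SplitIntoGroups(const std::wstring& name) {
  std::vector<std::wstring> groups;
  std::wstring current;
  for (const wchar_t c : name) {
    if (c == L'[' || c == L']') {
      if (!current.empty()) groups.push_back(current);
      current.clear();
    } else {
      current += c;
    }
  }
  if (!current.empty()) groups.push_back(current);
  return groups;
}
```

Calling `SplitIntoGroups(StripExtension(...))` on the filename above yields the three groups listed.
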
Each group is then split into tokens. In our current example, the delimiter for the enclosed group is `,`, while the words in other groups are separated by `_`:

    "Spice", "and", "Wolf", "Ep01", "1080p", "BluRay", "x264", "-", "THORA"

Note that brackets and delimiters are actually stored as tokens as well. They are left out of the lists here, and tokens are dropped from the lists once they have been identified, for our convenience.

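The splitting itself could look something like the sketch below. `SplitGroup` is a hypothetical helper, not part of *Anitomy*. It assumes the caller already knows which delimiter a group uses, and, unlike the tokenizer described above, it discards delimiters instead of storing them as tokens.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits a group at any of the given delimiter characters.
// Runs of delimiters are treated as a single separator.
std::vector<std::wstring> SplitGroup(const std::wstring& group,
                                     const std::wstring& delimiters) {
  std::vector<std::wstring> tokens;
  std::wstring::size_type start = 0;
  // Jump to the next character that is not a delimiter, if there is one.
  while ((start = group.find_first_not_of(delimiters, start)) !=
         std::wstring::npos) {
    const auto end = group.find_first_of(delimiters, start);
    tokens.push_back(group.substr(start, end - start));
    if (end == std::wstring::npos) break;  // The token ran to the end.
    start = end;
  }
  return tokens;
}
```

For instance, `SplitGroup(L"1080p,BluRay,x264", L",")` gives three tokens, and `SplitGroup(L"_-_THORA", L"_")` gives `-` and `THORA`.
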
Once the tokenizer is done, the parser comes into effect. First, all tokens are compared against a set of known patterns and keywords. This process generally leaves us with nothing but the release group, anime title, episode number and episode title:

    "Spice", "and", "Wolf", "Ep01", "-"

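The keyword comparison can be pictured as a case-insensitive table lookup. The sketch below is only an illustration: the `Category` enum, the `Classify` function and the four table entries are made up for this example, and *Anitomy*'s real keyword list and categories are far larger. Pattern-based elements such as `1080p` are not covered here.

```cpp
#include <cwctype>
#include <map>
#include <string>

// Categories used only in this sketch.
enum class Category { kSource, kVideoTerm, kUnknown };

// Hypothetical helper: upper-cases a token so that lookups ignore case.
std::wstring ToUpper(std::wstring text) {
  for (auto& c : text) c = static_cast<wchar_t>(std::towupper(c));
  return text;
}

// Looks a token up in a tiny, made-up keyword table.
Category Classify(const std::wstring& token) {
  static const std::map<std::wstring, Category> kKeywords = {
      {L"BLURAY", Category::kSource},
      {L"DVD", Category::kSource},
      {L"X264", Category::kVideoTerm},
      {L"H264", Category::kVideoTerm},
  };
  const auto it = kKeywords.find(ToUpper(token));
  return it != kKeywords.end() ? it->second : Category::kUnknown;
}
```

Tokens that come back as `Category::kUnknown` are the ones left over for the later steps.
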
The next step is to look for the episode number. Each token that contains a number is analyzed. Here, `Ep01` is identified because it begins with a known episode prefix:

    "Spice", "and", "Wolf", "-"

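A prefix check of this kind might be sketched as follows. `MatchEpisodePrefix` and its three-entry prefix list are assumptions made for this example. The library recognizes more prefixes and more number formats than this.

```cpp
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: if `token` is a known prefix followed only by digits,
// stores the digits in `number` and returns true.
bool MatchEpisodePrefix(const std::wstring& token, std::wstring& number) {
  // Made-up prefix list, longest first. Matching is case-sensitive here.
  static const std::vector<std::wstring> kPrefixes = {L"Episode", L"Ep", L"E"};
  for (const auto& prefix : kPrefixes) {
    if (token.size() <= prefix.size()) continue;  // No room for a number.
    if (token.compare(0, prefix.size(), prefix) != 0) continue;
    const std::wstring rest = token.substr(prefix.size());
    bool all_digits = true;
    for (const wchar_t c : rest) {
      if (!std::iswdigit(c)) {
        all_digits = false;
        break;
      }
    }
    if (all_digits) {
      number = rest;
      return true;
    }
  }
  return false;
}
```

With this sketch, `Ep01` matches and yields `01`, while `Wolf` does not match.
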
Finally, remaining tokens are combined to form the anime title, which is `Spice and Wolf`. The complete list of elements identified by *Anitomy* is as follows:

- Anime title: *Spice and Wolf*
- Episode number: *01*
- Video resolution: *1080p*
- Source: *BluRay*
- Video term: *x264*
- Release group: *THORA*

## Why should I use it?

Anime video files are commonly named in a format where the anime title is followed by the episode number, and all the technical details are enclosed within brackets. However, fansub groups tend to use their own naming conventions, and the problem is more complicated than it first appears:

- Element order is not always the same.
- Technical information is not guaranteed to be enclosed.
- Brackets and parentheses may be grouping symbols or a part of the anime/episode title.
- Space and underscore are not the only delimiters in use.
- A single filename may contain multiple delimiters.

There are so many cases to cover that it's simply not possible to parse all filenames solely with regular expressions. *Anitomy* tries a different approach, and it succeeds: it's able to parse tens of thousands of filenames per second, with great accuracy.

The following projects make use of *Anitomy*:

- [Taiga](https://github.com/erengy/taiga)
- [MAL Updater OS X](https://github.com/chikorita157/malupdaterosx-cocoa)
- [Hachidori](https://github.com/chikorita157/hachidori)
- [Shinjiru](https://github.com/Kazakuri/Shinjiru)

See [other repositories](https://github.com/search?utf8=%E2%9C%93&q=anitomy) for related projects (e.g. interfaces, ports, wrappers).

## Are there any exceptions?

Yes, unfortunately. *Anitomy* fails to identify the anime title and episode number on rare occasions, mostly due to bad naming conventions. See the examples below.

    Arigatou.Shuffle!.Ep08.[x264.AAC][D6E43829].mkv

Here, *Anitomy* would report that this file is the 8th episode of `Arigatou Shuffle!`, where `Arigatou` is actually the name of the fansub group.

    Spice and Wolf 2

Is this the 2nd episode of `Spice and Wolf`, or a batch release of `Spice and Wolf 2`? Without a file extension, there's no way to know. It's up to you to consider both cases.

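One way an application could handle the second case is to keep both readings and decide later, for example by checking each candidate title against its own anime list. The sketch below is independent of *Anitomy*: the `Interpretation` struct and the `Interpret` function are hypothetical, and they only handle space-delimited names.

```cpp
#include <cwctype>
#include <string>
#include <vector>

// One possible reading of a name. Field names are specific to this sketch.
struct Interpretation {
  std::wstring title;
  std::wstring episode;  // Empty when the whole name is taken as the title.
};

// Hypothetical helper: returns every reading of a space-delimited name.
// A second reading is added only when the last word consists of digits.
std::vector<Interpretation> Interpret(const std::wstring& name) {
  std::vector<Interpretation> result;
  result.push_back({name, L""});  // Reading 1: the number is part of the title.

  const auto space = name.rfind(L' ');
  if (space == std::wstring::npos || space + 1 == name.size()) return result;

  const std::wstring last = name.substr(space + 1);
  for (const wchar_t c : last) {
    if (!std::iswdigit(c)) return result;
  }
  // Reading 2: the trailing number is an episode number.
  result.push_back({name.substr(0, space), last});
  return result;
}
```

For `Spice and Wolf 2`, this returns both `Spice and Wolf 2` with no episode and `Spice and Wolf` with episode `2`.
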
## Suggestions to fansub groups

Please consider abiding by these simple rules before deciding on your naming convention:

- Don't enclose the anime title, episode number and episode title within brackets. Enclose everything else, including the name of your group.
- Don't use parentheses to enclose release information; use square brackets instead. Parentheses should only be used if they are a part of the anime/episode title.
- Don't use multiple delimiters in a single filename. If possible, stick with either space or underscore.
- Use a separator (e.g. a dash) between the anime title and the episode number. There are anime titles that end with a number, which creates ambiguity.
- Indicate the episode interval in batch releases.

## License

*Anitomy* is licensed under the [Mozilla Public License 2.0](https://www.mozilla.org/en-US/MPL/2.0/FAQ/).