memset 5 days ago

The way this analysis and the original dataset were created makes no sense. This is, in part, not the author's fault, since the original data [1, 2] is flawed.

First, the original data was constructed like this: "...The next step was to format the raw HTML files into the full chord progression of each song, collapsing repeating identical chords into a single chord (’A G G A’ became ’A G A’)..."

Already this makes no sense - the fact that a chord is repeated isn't some sort of typo (though maybe it is on UltimateGuitar). For example, a blues might have a progression C7 F7 C7 C7 - the fact that C7 is repeated is part of the blues form. See song 225 from the dataset, which is a blues:

A7 D7 A7 D7 A7 E7 D7 A7

Should really be:

A7 D7 A7 A7 D7 D7 A7 A7 E7 D7 A7 A7

With these omissions, it's a lot harder to understand the underlying harmony of these songs.
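To make the information loss concrete, here is a minimal sketch of the collapsing step as the quoted sentence describes it. The function name `collapse_repeats` and the space-separated input format are assumptions for illustration, not the paper's actual code.

```python
from itertools import groupby


def collapse_repeats(progression: str) -> str:
    """Collapse each run of identical adjacent chords into a single chord.

    Hypothetical helper: it mimics the preprocessing described in the quote,
    assuming chords are separated by whitespace.
    """
    chords = progression.split()
    # groupby yields one (chord, run) pair per run of equal adjacent items
    return " ".join(chord for chord, _run in groupby(chords))


# The twelve-bar form given above, one chord per bar
twelve_bar = "A7 D7 A7 A7 D7 D7 A7 A7 E7 D7 A7 A7"
collapsed = collapse_repeats(twelve_bar)

print(collapsed)
print(len(twelve_bar.split()), "bars in,", len(collapsed.split()), "chords out")
```

Twelve bars go in and eight chords come out, and nothing in the collapsed string says which chords were held for two bars, so the form cannot be recovered from it.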

The second problem is that we analyze songs less by the chords themselves than by the relationships between chords. A next step would be to convert each song from chords to roman numerals so we can understand common patterns of how songs are constructed. Maybe a weekend project.
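A rough sketch of what that conversion could look like, under several loud assumptions: the key of each song is already known (the hard part in practice), only major keys are handled, chord names are a root letter plus an optional `#` or `b` plus a free-form suffix, and roots outside the scale get a flat or sharp prefix. All names here (`parse_root`, `to_roman`) are hypothetical.

```python
import re

PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Semitones above the tonic -> numeral, relative to a major key (assumed spelling)
DEGREE = {
    0: "I", 1: "bII", 2: "II", 3: "bIII", 4: "III", 5: "IV",
    6: "#IV", 7: "V", 8: "bVI", 9: "VI", 10: "bVII", 11: "VII",
}


def parse_root(name: str) -> tuple[int, str]:
    """Split a chord name like 'Bbm7' into (pitch class of root, suffix)."""
    match = re.match(r"^([A-G])([#b]?)(.*)$", name)
    if match is None:
        raise ValueError(f"unrecognized chord name: {name!r}")
    letter, accidental, suffix = match.groups()
    shift = {"#": 1, "b": -1, "": 0}[accidental]
    return (PITCH_CLASS[letter] + shift) % 12, suffix


def to_roman(chord: str, key: str) -> str:
    """Express one chord as a roman numeral relative to a major key."""
    root, suffix = parse_root(chord)
    tonic, _ = parse_root(key)
    numeral = DEGREE[(root - tonic) % 12]
    # Treat a leading 'm' (but not 'maj') as minor: lowercase numeral, drop the 'm'
    if suffix.startswith("m") and not suffix.startswith("maj"):
        numeral = numeral.lower()
        suffix = suffix[1:]
    return numeral + suffix


blues_in_a = "A7 D7 A7 A7 D7 D7 A7 A7 E7 D7 A7 A7".split()
print(" ".join(to_roman(chord, "A") for chord in blues_in_a))
```

With this, the same blues written in any key maps to the same numeral sequence, which is what would make patterns comparable across songs.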

[1] https://arxiv.org/pdf/2410.22046 [2] https://huggingface.co/datasets/ailsntua/Chordonomicon/blob/...

zenogantner 5 days ago | parent | next [-]

The problem with collapsed repeated chords comes not only from the data processing -- most Ultimate Guitar songs are written down entirely ignoring how often a chord is repeated -- the classic "lyrics plus chords" format is incomplete and requires the player to somewhat know the structure of the song anyway. The write-up usually just gives hints where, relative to the lyrics, the chord changes.

moefh 4 days ago | parent [-]

Exactly. In my experience, it's not just Ultimate Guitar: all of these sites with chord progressions assume you already know how the music sounds. They're not enough for someone to learn a song having never heard it, so they're almost certainly not enough to automate analysis of the chord progressions.

b800h 5 days ago | parent | prev | next [-]

I agree with you to some extent, but I'm also alive to the problem of how you achieve what you're talking about when chords can change at any point in a bar.

volemo 5 days ago | parent | prev | next [-]

Could you explain the Roman numerals part?

Twirrim 5 days ago | parent | next [-]

By convention in music, we use Roman numerals to signify what chord we should play relative to the root (key). "I" refers to the root/tonic/key and we count up from there. [1]

So, for example, a common three chord progression in a major scale would be I – IV – V. If we take the key of C, those would be C, F, G, as F and G are the fourth and fifth chords respectively.

In the key of G, it'd be G, C and D. In that key, a good example song is "Sweet Home Alabama", where almost the entire song is just V - IV - I over and over again.
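The mapping from numerals to actual chords is mechanical enough to sketch in a few lines. This is an illustration only, assuming major keys, sharp-only note spelling (so flat keys such as F would come out misspelled), and the usual convention that a lowercase numeral means a minor chord; the function name `realize` is made up.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitones above the tonic
NUMERALS = ["I", "II", "III", "IV", "V", "VI", "VII"]


def realize(progression: list[str], key: str) -> list[str]:
    """Turn roman numerals into chord names in the given major key."""
    tonic = NOTE_NAMES.index(key)
    chords = []
    for numeral in progression:
        degree = NUMERALS.index(numeral.upper())
        root = NOTE_NAMES[(tonic + MAJOR_SCALE_STEPS[degree]) % 12]
        # Assumed convention: lowercase numeral -> minor chord
        chords.append(root + ("m" if numeral.islower() else ""))
    return chords


print(realize(["I", "IV", "V"], "C"))
print(realize(["I", "IV", "V"], "G"))
print(realize(["V", "IV", "I"], "G"))
```

The three calls reproduce the examples above: C, F, G in the key of C; G, C, D in the key of G; and D, C, G for the V - IV - I pattern in G.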

One of the most popular chord progressions, used in an astounding number of pop songs, is known as the "Four Chord Trick", I – V – VI – IV, famously demonstrated by the Aussie comedy band Axis of Awesome [2].

I think I'd agree with the person you're replying to, both in that the original source is flawed due to not including the "dupes", despite them being important, and also because the key is largely irrelevant; the chord progression is much more important.

[1] https://en.wikipedia.org/wiki/Roman_numeral_analysis [2] https://www.youtube.com/watch?v=oOlDewpCfZQ

cpelletier 5 days ago | parent | next [-]

Minor chords are written in lowercase so the Axis of Awesome progression should be I-V-vi-IV

james_marks 4 days ago | parent | prev [-]

This is about the simplest description of chord progressions you're going to find.

It is peculiar that people who understand music theory tend to have a difficult time explaining it without stacking concepts and new terms.

While I'm sure those concepts are necessary for completeness, to a beginner it becomes a brick wall, and this is blessedly direct compared to, e.g., the linked Wikipedia entry.

zzo38computer 5 days ago | parent | prev | next [-]

The numeral "I" means the chord built on the first note of the scale (e.g. C E G in C major, or F A C in F major); uppercase means major and lowercase means minor. The other numerals work the same way, e.g. "V" is G B D in C major. You may also add digits, which indicate intervals above the bass: "V6" is a first-inversion chord (e.g. B D G in C major) and "V7" adds the seventh (e.g. G B D F in C major).

slater- 4 days ago | parent [-]

You're talking about figured bass, which is its own type of notation.

"V6" to a jazz player would not indicate first inversion, it would be a major triad (built from the 5th of the tonic scale) with the addition of its own 6th scale degree. "V7" would include the dominant 7th (as opposed to the major 7th), "V13" would have the dominant 7th and also the 6th. Inversions aren't specified, the voicings are left up to the player.

zenogantner 5 days ago | parent | prev [-]

Typically, chord progressions are described independently of the key they are in.

For example: https://en.wikipedia.org/wiki/%2750s_progression
