Create a text file of markdown links within each section of a document

Is there a reason why Headings or HeadingsMap do not include a field with the number associated with the heading? That would be really helpful, and it is a pretty natural property… it just flows from the page structure.

Interesting idea, but could be tricky to get right. I’m not sure what your use case is, but the most likely usage would be in the heading render hook such that this:

## Foo
### Bar
### Baz

renders this:

<h2 id="foo">1 Foo</h2>
<h3 id="bar">1.1 Bar</h3>
<h3 id="baz">1.2 Baz</h3>

But then the tricky parts:

  • What if the first heading on the page is h1 or h3?
  • What if the author skips a level (e.g., h2 followed by h4)?
  • What about headings inserted via shortcodes using the {{% .. %}} notation?
  • Should the element id include the number?
  • Wouldn’t it need to be configurable? (see below)

This form of numbering is common in contracts, licenses, etc., but the styles are not standard. For example…

<h2 id="foo">1 Foo</h2>
<h3 id="bar">1.1 Bar</h3>
<h3 id="baz">1.2 Baz</h3>

or 

<h2 id="foo">(1) Foo</h2>
<h3 id="bar">(1a) Bar</h3>
<h3 id="baz">(1b) Baz</h3>

etc.

So, an interesting idea, but hard to get right. Ask anyone who’s struggled with MS Word or similar in this area.

If the numbering were purely for display purposes (i.e., not part of the .Fragments object), you could probably do this with CSS.

It’s not for display purposes in my case; I do use CSS for that whenever possible. I want the relative order of successive headings, to construct a bibliography. Look here. So it’s not so much the number as the order of the headings that I need: the difference between a set and a sequence, in data-structure terms.

Could we just add[1] .Ordinal to the context received by the heading render hook, or do you need it in the .Fragments object?


  1. “just add” makes this sound like a trivial effort, which it isn’t

For use outside the render hook… I vote for the dedicated objects.
In single.csv, I would iterate recursively through the map and write out the references wherever they are mentioned in the text. That would be far easier than anything I can come up with now… if only I had this order.

So the end goal is a list of markdown links within each section, written to an alternate output format?

Yep, exactly. In order of appearance. If only I knew how to express myself as precisely as you do!

@davidsneighbour suggested an approach using a combination of heading and link render hooks, roughly demonstrated by:

git clone --single-branch -b hugo-forum-topic-48875 https://github.com/jmooring/hugo-testing hugo-forum-topic-48875
cd hugo-forum-topic-48875
hugo server

But instead of using the HTML render hooks, you’d create render hooks for the CSV output format.

Wait, in that repo I can tell which heading a link is under, yes, but I fail to see how to establish the order of the links. The template outputs HTML, so the order is apparent… everything is just written one after the other. But that order is not available when I want it from another template, single.csv for instance. I don’t know if I’m expressing myself well?

As I understand it, this method only associates a heading, its links, and an ID; it can’t write the headings in sequence outside of render-heading.html.

I can generate an alternate output format like this (JSON):

[
  {
    "destination": "https://example.org/link-1",
    "heading": "Section 1"
  },
  {
    "destination": "https://example.org/link-2",
    "heading": "Section 1"
  },
  {
    "destination": "https://example.org/link-3",
    "heading": "Section 2"
  },
  {
    "destination": "https://example.org/link-4",
    "heading": "Section 2"
  }
]

If this were converted to CSV would it be what you want? And would it be one CSV file per content page, or one CSV file that contains all links for all pages (i.e., would there need to be a third field with the page URL)?

Have a look at the revision:

git clone --single-branch -b hugo-forum-topic-48875 https://github.com/jmooring/hugo-testing hugo-forum-topic-48875
cd hugo-forum-topic-48875
rm -rf public && hugo && cat public/index.csv

Files of interest:

  • hugo.toml
  • layouts/_default/_markup/render-heading.html
  • layouts/_default/_markup/render-link.html
  • layouts/_default/home.csv

See the following:

### titre A
[n o matter the .Text](https://google.com/link-A "reference 1")
[BLABLABLA](https://google.com/link-2 "reference 2")
## subtitle 1
[SDKLFJSDMLKJF](https://google.com/h "reference 3")
[DSFSD](https://google.com/1 "reference A")
## skip if without without links
## titre B 
##### titre 2
[DFSDFSD](https://google.com/link-a "reference 4")
[link-4](https://google.com/link-b "reference B")
### subtitle A
##### subtitle 1
[in case there is no .Title attribute](https://google.com/link-6)
[link-2 sub 2](https://google.com/link- "reference1")

It should result in:

### titre A
reference 1 : https://google.com/link-A
reference 2 : https://google.com/link-2
## subtitle 1
reference 3 : https://google.com/h
reference A : https://google.com/1
## B
##### titre 2
reference 4 : https://google.com/link-a
reference B : https://google.com/link-b
### subtitle A
##### subtitle 1
in case there is no .Title attribute : https://google.com/link-6
reference1 : https://google.com/link-

So that would be:

  • one per page, yes.
  • reproduce the exact link order and heading hierarchy as is, but skip headings that have no links and no subheadings with links… empty branches, so to speak.
  • use .Title instead of .Text when possible (I can do that…)

The only problem with your example is that only the “leaf” headings appear, not their ancestors:

"https://example.org/posts/post-1/","titre A","https://google.com/link-A"
"https://example.org/posts/post-1/","titre A","https://google.com/link-2"
"https://example.org/posts/post-1/","subtitle 1","https://google.com/h"
"https://example.org/posts/post-1/","subtitle 1","https://google.com/1"

here should be "## B"

"https://example.org/posts/post-1/","titre 2","https://google.com/link-a"
"https://example.org/posts/post-1/","titre 2","https://google.com/link-b"

here should be "### subtitle A"

"https://example.org/posts/post-1/","subtitle 1","https://google.com/link-6"
"https://example.org/posts/post-1/","subtitle 1","https://google.com/link-"

I think the last revision to the test branch handles this.

So you want a CSV file with records that don’t have fields?

I don’t understand. Was my phrasing terrible? People have been telling me so these last few days.
Subtrees of the heading tree that contain no records/links should not appear, so “skip if without without links” should not appear.

It sounds like you want this CSV file:

## S1
### S1.1
heading,link
heading,link

That’s not a CSV file.

Yes, exactly! Well, there is really no CSV standard… txt is fine. A data file.

And you wonder why people get frustrated when trying to help you.

Let’s do this again…

git clone --single-branch -b hugo-forum-topic-48875 https://github.com/jmooring/hugo-testing hugo-forum-topic-48875
cd hugo-forum-topic-48875
rm -rf public && hugo && cat public/posts/post-1/bib.txt

Files of interest:

  • hugo.toml
  • layouts/_default/_markup/render-heading.html
  • layouts/_default/_markup/render-link.html
  • layouts/_default/single.bib.txt

The output (bib.txt) is:

## Section 1
### Section 1.1
Title of Link 1 : https://google.com/link-1
Title of Link 2 : https://google.com/link-2
### Section 1.2
Title of Link 3 : https://google.com/link-3
Title of Link 4 : https://google.com/link-4
## Section 2
### Section 2.1
### Section 2.2
## Section 3
link-5 : https://google.com/link-5
link-6 : https://google.com/link-6

This includes heading branches that do not contain links. Omitting those branches is an expensive and complicated exercise, requiring a recursive partial and maybe a separate data structure. That exercise is left to the reader.

I’ll go with this, many thanks. But what does $_ do? It appears only once in single.bib.txt and nowhere else, not even in the documentation.

TLDR: It signifies that we won’t use this variable.


http://localhost:1313/templates/introduction/#variables

Variable names begin with a $, followed by one or more underscores, letters, or digits. Although allowed, I would discourage a $ followed by a digit (e.g., don’t use $123) in order to comply with Go’s identifier specification.

It is a common construct in Go to prevent variable initialization by using _. For example, in a Hugo template, if we know we won’t use the keys when ranging over a map:

{{ range $_, $v := $map }}
  {{ $v }}
{{ end }}

Or if we plan to use the key, but would prefer to access the values via dot notation:

{{ range $k, $_ := $map }}
  {{ $k }} = {{ . }}
{{ end }}

This syntax is not a requirement, but I find it useful to reduce visual noise. As far as I know it does not prevent variable initialization as it does in Go, but I could be wrong about that.
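One way to check the initialization question is with Go's text/template package, which Hugo's templates are built on. The sketch below is an experiment against the standard library only, and it assumes Hugo's copy of the package behaves the same way; the render helper is made up for the example. If $_ were a true blank identifier, reading it back would fail; here it prints the map keys, which suggests it is an ordinary variable that we simply choose to ignore.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render parses src as a template, executes it against data, and
// returns the output (hypothetical helper for this experiment).
func render(src string, data interface{}) (string, error) {
	tmpl, err := template.New("demo").Parse(src)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := map[string]int{"a": 1, "b": 2}

	// Conventional use: $_ receives the key, and we never read it.
	out, err := render(`{{ range $_, $v := . }}{{ $v }};{{ end }}`, data)
	fmt.Println(out, err)

	// Reading $_ anyway: the keys come back, so the variable was set.
	out, err = render(`{{ range $_, $v := . }}{{ $_ }}={{ $v }};{{ end }}`, data)
	fmt.Println(out, err)
}
```

The second template produces a=1;b=2; because text/template visits map keys of basic types in sorted order.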