Quantcast
Channel: Active questions tagged jq - Stack Overflow
Viewing all articles
Browse latest Browse all 527

Run an external command within jq to manipulate each values of a particular key

$
0
0

I have a json like

[  {"url": "https://drive.google.com/file/d/1tO-qVknlH0PLK9CblQsyd568ZiptdKff/view?usp=share_link","title": "– Flexibility"  },  {"url": "https://drive.google.com/open?id=11_sR8X13lmPcvlT3POfMW3044f3wZdra","title": "– Pronouns"  }]

I got it using curl -Lfs "https://motivatedsisters.com/2019/07/08/arabic-review-sr-rahat-basit/" | rg -o '<li>.*?href="(.*?)".*?</a> (.*?)<\/li>' -r '{"url": "$1", "title": "$2"}' | jq -s '.'.

I have a command in my machine named unescape_html, a python scipt to unescape the html (replace – with appropriate character).

How can I apply this function on each of the titles using jq.

For example:

I want to run:

unescape_html "&#8211; Flexibility"unescape_html "&#8211; Pronouns"

The expected output is:

[  {"url": "https://drive.google.com/file/d/1tO-qVknlH0PLK9CblQsyd568ZiptdKff/view?usp=share_link","title": "– Flexibility"  },  {"url": "https://drive.google.com/open?id=11_sR8X13lmPcvlT3POfMW3044f3wZdra","title": "– Pronouns"  }]

Update 1:

If jq doesn't have that feature, then i am also fine that the command is applied on rg.

I mean, can i run unescape_html on $2 on the line:

rg -o '<li>.*?href="(.*?)".*?</a> (.*?)<\/li>' -r '{"url": "$1", "title": "$2"}'

Any other bash approach to solve this problem is also fine. The point is, i need to run unescape_html on title, so that i get the expected output.

Update 2:

The following command:

curl -Lfs "https://motivatedsisters.com/2019/07/08/arabic-review-sr-rahat-basit/" | rg -o '<li>.*?href="(.*?)".*?</a> (.*?)<\/li>' -r '{"url": "$1", "title": "$2"}' | jq -s '.' | jq 'map(.title |= @sh "unescape_html \(.)")'

gives:

[  {"url": "https://drive.google.com/file/d/1tO-qVknlH0PLK9CblQsyd568ZiptdKff/view?usp=share_link","title": "unescape_html '&#8211; Flexibility'"  },  {"url": "https://drive.google.com/open?id=11_sR8X13lmPcvlT3POfMW3044f3wZdra","title": "unescape_html '&#8211; Pronouns'"  }]

Just not evaluating the commands.

The following command works:

curl -Lfs "https://motivatedsisters.com/2019/07/08/arabic-review-sr-rahat-basit/" | rg -o '<li>.*?href="(.*?)".*?</a> (.*?)<\/li>' -r '{"url": "$1", "title": "$2"}' | jq -s '.' | jq 'map(.title |= sub("&#8211;"; "–"))'

But it only works for &#8211;. It will not work for other reserved characters.

Update 3:

The following command:

curl -Lfs "https://motivatedsisters.com/2019/07/08/arabic-review-sr-rahat-basit/" | rg -o '<li>.*?href="(.*?)".*?</a> (.*?)<\/li>' -r '{"url": "$1", "title": "$2"}' | jq -s '.' | jq -r '.[] | "unescape_html \(.title | @sh)"' | bash

is giving:

– Flexibility– Pronouns

So, it is applying the bash function. Now I need to format is so that the urls are come with the title in json format.


Viewing all articles
Browse latest Browse all 527

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>