Decommissioned – As the .org site moved to WP the plugin no longer works and is no longer supported.
Brief
Build an import script that scrapes all amnesty.org content documents in any given language, e.g. Persian: https://www.amnesty.org/en/search/?q=&ref=&year=&lang=fa&adv=1&sort=relevance
For each page found, the content should be copied to the new site, e.g. pages like https://www.amnesty.org/en/documents/mde13/010/2009/fa/. The title, original URL, Amnesty index number, excerpt, date and Persian file should be copied.
The PDF should also be copied and linked to, for example https://www.amnesty.org/download/Documents/32000/mde130602011fa.pdf
The topics should also be migrated, and translated where a translation exists. If the corresponding category already exists in the CMS, the topic should be mapped to that translated category.
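As a rough illustration of the URL handling the brief implies, the Python sketch below builds the per-language search URL and splits a document URL into its parts. The names `search_url`, `parse_document_url` and `DocumentRef` are hypothetical, and the PDF filename pattern is only inferred from the single example above (`mde130602011fa.pdf`); the download folder (`32000` in that example) is not derived here because the brief does not say how it is chosen.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlencode, urlparse

SEARCH_BASE = "https://www.amnesty.org/en/search/"

# Path shape taken from the example URL in the brief:
# /en/documents/mde13/010/2009/fa/
_DOC_PATH = re.compile(
    r"^/[a-z]{2}/documents/([a-z]+\d+)/(\d+)/(\d{4})/([a-z-]+)/?$"
)


@dataclass(frozen=True)
class DocumentRef:
    """Parts of an amnesty.org document URL (hypothetical helper type)."""
    region_code: str  # e.g. "mde13"
    serial: str       # e.g. "010"
    year: str         # e.g. "2009"
    lang: str         # e.g. "fa"

    @property
    def pdf_filename(self) -> str:
        # Assumption: filename is the URL parts concatenated, as in the
        # one example PDF link given in the brief.
        return f"{self.region_code}{self.serial}{self.year}{self.lang}.pdf"


def search_url(lang: str) -> str:
    """Build the search URL that lists all documents in one language."""
    params = {"q": "", "ref": "", "year": "", "lang": lang,
              "adv": "1", "sort": "relevance"}
    return f"{SEARCH_BASE}?{urlencode(params)}"


def parse_document_url(url: str) -> DocumentRef:
    """Split a single document URL into its parts, or raise ValueError."""
    parsed = urlparse(url)
    if parsed.netloc != "www.amnesty.org":
        raise ValueError(f"not an amnesty.org URL: {url}")
    match = _DOC_PATH.match(parsed.path)
    if match is None:
        raise ValueError(f"not a document URL: {url}")
    return DocumentRef(*match.groups())


if __name__ == "__main__":
    print(search_url("fa"))
    ref = parse_document_url(
        "https://www.amnesty.org/en/documents/mde13/010/2009/fa/")
    print(ref, ref.pdf_filename)
```

Rejecting URLs that do not match the expected shape, rather than guessing, would keep the single-URL import from creating half-populated entries.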
Note:
- A bulk import will be done once by the development team; this cannot be done in the CMS.
- Subsequent imports can be done by the website CMS users by navigating to a page in the CMS and pasting a single amnesty.org document URL; the content of that URL will then be migrated.
- Topics on .org are migrated to categories on the Iran site. Topics have no hierarchy, whereas categories do, so parent and child categories must have unique names in order to be mapped when migrating.
- AI would get a list of all taxonomy items from .org in English from I&T
- AI requesting team would get them translated
- BigBite would upload these translations to the site
- BigBite would run the script for the first time
- This would migrate all the content as discussed (the title, index number, date, excerpt, topic names and a copy of the PDF). This would also translate the taxonomy names based on the supplied translations
- AI requesting team may create a structure for the taxonomy names in the CMS, i.e. you may wish to create a parent category for issues, then put all issues as children of that parent.
- AI requesting team will then have an option in the CMS to paste a URL, e.g. when new content is added to AIDAN; the script will then copy this content.
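The topic-to-category mapping described in the notes above could look roughly like the Python sketch below. It is not the plugin's actual code: `resolve_categories` and its arguments are hypothetical, the translated names are placeholders rather than real Persian strings, and keeping the English name when no translation is supplied is an assumption. Because category names are unique across parents and children, a lookup by name alone is enough to find an existing category.

```python
from typing import Dict, List, Optional, Tuple


def resolve_categories(
    topics: List[str],
    translations: Dict[str, str],
    existing: Dict[str, int],
) -> List[Tuple[str, Optional[int]]]:
    """Map .org topics to CMS categories.

    topics:       English topic names found on the .org document
    translations: English topic name -> translated name (supplied list)
    existing:     category name -> category ID already in the CMS
    Returns (category name, existing ID or None if it must be created).
    """
    resolved = []
    for topic in topics:
        # Assumption: untranslated topics keep their English name.
        name = translations.get(topic, topic)
        # Names are unique across the hierarchy, so the name is the key.
        resolved.append((name, existing.get(name)))
    return resolved


if __name__ == "__main__":
    # Placeholder data, not real translations or category IDs.
    translations = {"Death Penalty": "FA:Death Penalty"}
    existing = {"FA:Death Penalty": 12}
    print(resolve_categories(["Death Penalty", "Torture"],
                             translations, existing))
```

An entry with `None` as its ID marks a category that does not exist yet, so the caller can decide whether to create it or skip it.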