Feature #1579

Portal 0.6.1: images of facs edition are not managed by the user account system

Added by Sebastien Jacquot about 4 years ago. Updated 6 months ago.

Status: New
Start date: 11/02/2015
Priority: High
Due date:
Assignee: -
% Done: 0%
Category: Edition
Spent time: -
Target version: Portal 0.7

Description

Currently, the images of a facs edition can only be made accessible by publishing them in the public Tomcat directory. So even if access to a corpus is restricted through profiles, its images remain accessible to any client that points to their URL (or to the parent directory, depending on the server configuration).

Solution 1

  • create some kind of getfile.jsp page that reads the images stored outside the public Tomcat space and serves them only if the user is connected and has the right permissions on the corpus, e.g.: getfile.jsp?img=185&corpora=DISCOURS
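The check such a page would run before streaming a file can be sketched as follows. This is only an illustration under assumptions: the class name FacsImageResolver, the <imageRoot>/<corpus>/<img>.png layout and the idea that the session provides the set of corpora the user may read are all hypothetical and not part of TXM.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper: decides whether an image may be served and where it is stored. */
public class FacsImageResolver {

    // directory outside the public Tomcat space (assumed layout: <imageRoot>/<corpus>/<img>.png)
    private final Path imageRoot;

    public FacsImageResolver(Path imageRoot) {
        this.imageRoot = imageRoot.toAbsolutePath().normalize();
    }

    /**
     * @param corpus         value of the "corpora" request parameter
     * @param img            value of the "img" request parameter
     * @param allowedCorpora corpora the connected user may read, or null if nobody is connected
     * @return the file to stream, or empty if the request must be refused
     */
    public Optional<Path> resolve(String corpus, String img, Set<String> allowedCorpora) {
        // refuse anonymous requests and incomplete query strings
        if (allowedCorpora == null || corpus == null || img == null) {
            return Optional.empty();
        }
        // refuse corpora the user has no permission on
        if (!allowedCorpora.contains(corpus)) {
            return Optional.empty();
        }
        // image identifiers are assumed to be numeric, which also rules out "../" tricks
        if (!img.matches("[0-9]+")) {
            return Optional.empty();
        }
        Path candidate = imageRoot.resolve(corpus).resolve(img + ".png").normalize();
        // defensive check: the resolved file must exist and stay under the image root
        if (!candidate.startsWith(imageRoot) || !Files.isRegularFile(candidate)) {
            return Optional.empty();
        }
        return Optional.of(candidate);
    }
}
```

A JSP page or servlet would call resolve() with the request parameters and the permissions of the current session, answer with an HTTP error when the result is empty, and otherwise copy the file to the response stream.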

Solution 2

  • TBD (use Java classes instead of a JSP page?)

Solution 3

  • embed the raw image data directly in the HTML pages using data URIs, e.g.:
<img src="data:image/png;base64,iDFgdVRw0KGgoAAdfgANSUhEUgAAADIA..." />

Current browser support for this kind of inline embedding needs to be checked.
See: http://caniuse.com/#feat=datauri
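As an illustration of what the embedding involves, here is a minimal sketch that builds such a data URI from raw bytes with the standard java.util.Base64 encoder. The class and method names are hypothetical, and the two-byte payload is only a stand-in for real image data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper illustrating the data URI form used in Solution 3. */
public class DataUriExample {

    /** Builds "data:<mimeType>;base64,<payload>" from raw bytes. */
    public static String toDataUri(String mimeType, byte[] content) {
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(content);
    }

    /** Length of the base64 payload for n input bytes: 4 characters per group of 3 bytes, padded. */
    public static long base64Length(long n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        // stand-in for real image bytes
        byte[] fake = "hi".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toDataUri("image/png", fake)); // prints data:image/png;base64,aGk=
        System.out.println(base64Length(300000));         // prints 400000
    }
}
```

Base64 turns every 3 bytes into 4 characters, so each page grows by about one third of the image size on top of the HTML itself, and the image can no longer be cached separately from the page.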

Solution 4

Use a dedicated image server component whose services go beyond access control and file transfer: for example, pyramidal image transfer with a zoom/pan strategy for large images, image annotation services, export of standard image metadata and annotations, etc. Example: digilib.

The image server component can be embedded in the TXM portal software or can be an independent Tomcat service with connectors to the TXM portal.

The image server component and TXM must share at least the corpora image information, the user identities and the access control rules. To that end, the TXM identity management component could be opened to an external identity component, e.g. an LDAP directory service (with a connection service such as Shibboleth), and possibly the access control rules as well.

Temporary workaround

Temporary workaround 2 (or permanent XTZ option: "Embed raw images in facs edition")

  • back up your corpora before using this script. It takes a TXM facs edition directory and creates another one in which the img@src URL of each page of the source edition is replaced with a data URI containing the base64-encoded image:
package main;

import java.io.File;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Locale;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/**
 * Parses the HTML files of a TXM facs edition directory, looks for the img@src
 * link and replaces it with inline base64-encoded image data (e.g.: <img
 * alt="Embedded Image"
 * src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADIA..." />).
 *
 * Data URI web browsers support: http://caniuse.com/#feat=datauri
 *
 * @author sjacquot
 *
 */
public class MainParseTXMFacsEditionAndEmbedImage {

    public static void main(String[] args) {

        // path of the source TXM facs edition containing HTML files with img@src URLs
        File srcEditionPath = new File("C:\\Tools\\Coding\\Java\\workspace_txm\\tests\\facs");

        // path to save the new facs edition HTML files with embedded images
        File targetEditionPath = new File("C:\\Tools\\Coding\\Java\\workspace_txm\\tests\\facs2");

        // errors
        List<File> fileErrors = new ArrayList<File>();
        List<String> errorMessages = new ArrayList<String>();

        // listFiles() returns null if the path does not exist or is not a directory
        File[] srcFiles = srcEditionPath.listFiles();
        if(srcFiles == null) {
            System.err.println("Source directory " + srcEditionPath.getAbsolutePath() + " does not exist or cannot be read.");
            return;
        }

        System.out.println("Creating output directory " + targetEditionPath.getAbsolutePath() + ".");
        targetEditionPath.mkdirs();

        for(File f : srcFiles) {
            try {

                // skip directories
                if(f.isDirectory()) {
                    continue;
                }

                System.out.println("* Parsing file " + f.getAbsolutePath() + "...");

                // build paths with the File API rather than a hardcoded "\\" separator
                File newOutputFile = new File(targetEditionPath, f.getName());

                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                DocumentBuilder loader = factory.newDocumentBuilder();
                // parse(File) is used because parse(String) expects a URI, not a file system path
                Document document = loader.parse(f);

                NodeList nodes = document.getElementsByTagName("img");
                if(nodes.getLength() > 0) {
                    // only the first img element of the page is processed
                    Node srcAttribute = nodes.item(0).getAttributes().getNamedItem("src");

                    // the src value is resolved relative to the directory of the HTML file
                    File imgFile = new File(f.getParentFile(), srcAttribute.getNodeValue());

                    String imgExtension = "";
                    int i = imgFile.getName().lastIndexOf('.');
                    if (i > 0) {
                        imgExtension = imgFile.getName().substring(i+1).toLowerCase(Locale.ROOT);
                    }
                    // the registered media type for JPEG files is image/jpeg, not image/jpg
                    String mimeSubtype = imgExtension.equals("jpg") ? "jpeg" : imgExtension;

                    System.out.println("Img tag found, src = " + imgFile.getAbsolutePath() + ". Opening image file and encoding data to base64...");

                    // read the image bytes as they are (no decoding/re-encoding) and encode them;
                    // java.util.Base64 replaces javax.xml.bind.DatatypeConverter, removed from the JDK in Java 11
                    String base64Data = Base64.getEncoder().encodeToString(Files.readAllBytes(imgFile.toPath()));

                    srcAttribute.setNodeValue("data:image/" + mimeSubtype + ";base64," + base64Data);

                    // save changes to file
                    System.out.println("Saving new file to " + newOutputFile.getAbsolutePath() + ".");
                    TransformerFactory transformerFactory = TransformerFactory.newInstance();
                    Transformer transformer = transformerFactory.newTransformer();
                    DOMSource source = new DOMSource(document);
                    StreamResult result = new StreamResult(newOutputFile);
                    transformer.transform(source, result);

                    System.out.println("File saved.");

                }
                else {
                    System.out.println("No img tag found, direct copying file " + f.getAbsolutePath() + " to " + newOutputFile.getAbsolutePath() + ".");
                    // Files.copy closes its streams itself, even when the copy fails
                    Files.copy(f.toPath(), newOutputFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
                    System.out.println("Copy done.");
                }

            }
            catch(Exception e) {
                fileErrors.add(f);
                errorMessages.add(e.getMessage());
                e.printStackTrace();
            }
        }

        // Done with errors
        if(!fileErrors.isEmpty()) {
            System.err.println("!!! DONE WITH ERROR. Files: ");
            for(int i = 0; i < fileErrors.size(); i++) {
                System.err.println(fileErrors.get(i).getAbsolutePath());
                System.err.println(errorMessages.get(i));
            }
        }
        // OK
        else {
            System.out.println("DONE WITH NO ERRORS.");
        }

    }
}

History

#1 Updated by Sebastien Jacquot about 4 years ago

  • Description updated (diff)

#2 Updated by Sebastien Jacquot about 4 years ago

  • Description updated (diff)

#3 Updated by Alexey Lavrentev about 3 years ago

  • Tracker changed from Bug to Feature
  • Priority changed from High to Normal
  • Target version changed from Portal 0.6.2 to Portal 0.7

#4 Updated by Sebastien Jacquot almost 3 years ago

  • Description updated (diff)

#6 Updated by Sebastien Jacquot almost 3 years ago

  • Description updated (diff)

#7 Updated by Sebastien Jacquot almost 3 years ago

BuildFacsEditions.groovy could easily be modified by adding the image reading and base64 encoding here:

    private writeImg(String src) {
        pagedWriter.writeStartElement("div");

        // FIXME: test embed raw data of image in HTML page with data URI

        pagedWriter.writeEmptyElement("img", ["src":src, "width":"100%"]);
        pagedWriter.writeEndElement(); // </div>
    }

#8 Updated by Matthieu Decorde almost 3 years ago

  • Priority changed from Normal to High

#9 Updated by Serge Heiden almost 3 years ago

  • Description updated (diff)

#10 Updated by Sebastien Jacquot over 2 years ago

  • Description updated (diff)

#11 Updated by Sebastien Jacquot over 2 years ago

  • Description updated (diff)

#12 Updated by Sebastien Jacquot over 2 years ago

  • Description updated (diff)

#13 Updated by Sebastien Jacquot over 2 years ago

  • Description updated (diff)

#14 Updated by Sebastien Jacquot 6 months ago

The access problem is the same for CSS files, which currently need to be in the public web directory.
