Ben Nadel at InVision In Real Life (IRL) 2019 (Phoenix, AZ) with: Nick Miller

Dynamically Loading Java Classes From JAR Files Using CreateObject() In Lucee 5.3.2.77

Yesterday, I took a look at parsing HTML natively in Lucee 5.3 using the htmlParse() function. The htmlParse() function returns an XML document, which isn't the easiest data-format to work with. But, for small, predictable inputs, using htmlParse() and some XPath is an attractive option. For example, on my blog, I now author my posts using Flexmark and Markdown. And, once the HTML content is generated from the Markdown, I need to extract the fenced-code blocks so that I can push them up to GitHub's Gist repository. This use-case is actually a perfect opportunity to explore both the htmlParse() function and the ability to dynamically load Java Classes using JAR files in Lucee 5.3.2.77.

In Adobe ColdFusion, in order to load the Flexmark Java library, I've been using JavaLoader. The JavaLoader project creates an isolated Class Loader that can instantiate Java classes based on a given set of JAR file paths. With Lucee 5.3, we can more-or-less replace the JavaLoader project with the native createObject() function.

When using the createObject() function to create objects of type "java", we can supply a collection of JAR file paths as an optional third argument. If we do this, Lucee will load the classes out of the given JAR files instead of looking in the JAR files that are configured to be loaded by the Lucee CFML server at start-up:

createObject( "java", "some.java.Class", [ "path/to/class.jar" ] )

CAUTION: My equating of the JavaLoader project to this new functionality in Lucee 5.3 is based purely on a few comments that I've seen in the Lucee message groups. See https://dev.lucee.org/t/javaloader-and-lucee-5/3035 and https://dev.lucee.org/t/solr-7-extension/3002/10. The Lucee documentation does not describe the createObject() function in this way - at least not as far as I understand it, given my novice-level insights into Java.

To see this in action, I'm going to load the Flexmark Java library in order to parse the following Markdown content. Then, I'm going to use the htmlParse() function to extract the embedded code blocks:

**Functions** in JavaScript are _awesome_. You can define them as Function
Declarations, which are hoisted and can be used ahead-of-time:

<div data-gist-filename="snippet-1.js" class="code">

```js
console.log( hoistedFunction() );

function hoistedFunction() {
	return( "Woot!" );
}
```

</div>

But, you can also define them as Function Expressions:

<div data-gist-filename="snippet-2.js" class="code">

```js
var functionExpression = function() {
	return( "Double woot!" );
};

console.log( functionExpression() );
```

</div>

JavaScript is the _bee's knees_!

As you can see, when I embed fenced code-blocks in my blog article content, I am surrounding them in a DIV that identifies the file name of the code to be used when it is saved as a GitHub Gist. As part of the post-processing of the blog content, I query the HTML for these DIV nodes and then pick out the content that lives between the generated <CODE></CODE> tags.

Right now, I do this with Regular Expressions; but, as you'll see below, I could be using the htmlParse() function:

<cfscript>

	markdownContent = fileRead( "./blog.md" );

	// First, convert the blog-entry markdown to HTML.
	// --
	// WHAT IS COOL: The markdownToHtml() function is loading the Flexmark Java library
	// under the hood. And, it's using explicitly-provided JAR files to do so. This
	// allows Java libraries to be consumed using an isolated loader that doesn't
	// conflict with the core JAR files that ship with Lucee.
	// --
	// CAUTION: Much of the above statement is based on ASSUMPTIONS about the value-add
	// of loading Java libraries in this way. There is little documentation on the
	// feature. As such, I am ASSUMING it is a replacement for the JavaLoader() project.
	// See here: https://dev.lucee.org/t/javaloader-and-lucee-5/3035
	// See here: https://dev.lucee.org/t/solr-7-extension/3002/10
	htmlContent = markdownToHtml( markdownContent );

	// Now that we have the generated HTML, extract the content of our fenced code-blocks
	// so that we can push them up to GitHub and create gists.
	// --
	// WHAT IS COOL: The extractGists() function is using Lucee's native htmlParse()
	// function to parse the HTML into an XML document. We then use XPath to query the
	// XML doc and retrieve the Gist content.
	gists = extractGists( htmlContent );

	dump( label = "Extracted Gist Data", var = gists );

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I parse the given HTML content and extract the fenced code-blocks. Each code-block
	* is returned in a Struct with its "name" and "content".
	* 
	* @content I am the HTML content being inspected.
	* @output false
	*/
	public array function extractGists( required string content ) {

		var htmlDoc = htmlParse( content );

		// The htmlParse() function returns an XML document with namespaces. As such,
		// we can search the document using XPath; however, when we select elements
		// by name, we have to use the local-name() function since naked element
		// selectors won't work.
		var gists = htmlDoc
			.search( "//*[ @data-gist-filename ][ @class = 'code' ]" )
			.map(
				( node ) => {

					return({
						name: node.xmlAttributes[ "data-gist-filename" ],
						content: node.search( "string( .//*[ local-name() = 'code' ]/text() )" )
					});

				}
			)
		;

		return( gists );

	}


	/**
	* I parse the given Markdown content into HTML content.
	* 
	* @content I am the markdown content being parsed.
	* @output false
	*/
	public string function markdownToHtml( required string content ) {

		// In order to use Flexmark, we are going to tell Lucee where our Flexmark JAR
		// files are located.
		var jarFiles = [
			expandPath( "./flexmark-all-0.42.12-lib.jar" )
		];

		// As a short-hand, create a Function Expression that proxies the createObject()
		// function and targets the Flexmark JAR files.
		var load = ( className ) => {

			return( createObject( "java", className, jarFiles ) );

		};

		// Load the necessary Flexmark Java Classes using the provided JAR paths.
		var HtmlRendererClass = load( "com.vladsch.flexmark.html.HtmlRenderer" );
		var ParserClass = load( "com.vladsch.flexmark.parser.Parser" );
		var options = load( "com.vladsch.flexmark.util.options.MutableDataSet" ).init();

		// Create our parser and renderer - both using the options.
		var parser = ParserClass.builder( options ).build();
		var renderer = HtmlRendererClass.builder( options ).build();
		
		// Parse the markdown into an Abstract Syntax Tree (AST) document node.
		var markdownAST = parser.parse( javaCast( "string", content ) );
		
		// Render the AST document into an HTML string.
		return( renderer.render( markdownAST ) );

	}

</cfscript>

As you can see, in the markdownToHtml() function, I am using the createObject() function to dynamically load the Flexmark Java libraries such that I can instantiate the Markdown parser and renderer. Then, in the extractGists() function, I am using the htmlParse() function along with some simple XPath to query the generated HTML content for my embedded code blocks.
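The local-name() note in extractGists() can be tried in isolation. The following is a minimal sketch, assuming a small made-up HTML fragment (it is not part of the demo). It compares a naked element selector against the local-name() form, using only the htmlParse() and .search() calls already shown above. Based on the comment in the code, the naked selector is expected to come back empty, but that is worth confirming on your own Lucee version:

```cfml
<cfscript>

	// A tiny, made-up HTML fragment used only for this comparison.
	doc = htmlParse( "<div class='code'><pre><code>var a = 1;</code></pre></div>" );

	// Naked element selector: according to the note in extractGists(), this is not
	// expected to match because the parsed elements live in a namespace.
	nakedResults = doc.search( "//code" );

	// Namespace-agnostic selector: compares only the local part of the element name.
	localNameResults = doc.search( "//*[ local-name() = 'code' ]" );

	dump( label = "Naked selector", var = nakedResults );
	dump( label = "local-name() selector", var = localNameResults );

</cfscript>
```

Attribute selectors, like the @data-gist-filename test in the demo, are not affected in the same way, since unprefixed attributes are not placed in the default namespace.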

You can also see that I am enthusiastically embracing the fat-arrow function syntax support in Lucee 5.3. What an exciting time to be alive!

That said, if we run this Lucee CFML page, we get the following output:

[Screenshot: Fenced code-block content extracted from HTML using createObject() and htmlParse() in Lucee 5.3.2.77.]

As you can see, I was able to use the createObject() function to dynamically load the Flexmark Java library. Then, I was able to use the htmlParse() function, along with some XPath, to parse the resultant HTML and extract the fenced code-block content for my subsequent GitHub Gist creation (not part of the demo).
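Given the CAUTION earlier in the post, it may be worth checking where a class was actually loaded from. The following is a rough sketch that asks the JVM for the class's code source using standard java.lang.Class methods. It assumes the same Flexmark JAR path as the demo, and it assumes that Lucee's class loader reports a code source at all (getCodeSource() is allowed to return null):

```cfml
<cfscript>

	// Same JAR path as the demo above.
	jarFiles = [
		expandPath( "./flexmark-all-0.42.12-lib.jar" )
	];

	// Instantiate one of the Flexmark classes using the explicit JAR paths.
	options = createObject( "java", "com.vladsch.flexmark.util.options.MutableDataSet", jarFiles )
		.init()
	;

	// Ask the JVM where the class definition came from. ASSUMPTION: the class loader
	// that Lucee uses here reports a code source; some class loaders return null.
	codeSource = options.getClass().getProtectionDomain().getCodeSource();

	if ( isNull( codeSource ) ) {

		dump( "No code source was reported for this class." );

	} else {

		dump( label = "Loaded from", var = codeSource.getLocation().toString() );

	}

</cfscript>
```

If the reported location is not the JAR file that was passed in, it suggests the class was resolved from somewhere else on the server, which is the kind of version clash discussed in the comments below.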

Being able to dynamically load Java libraries from JAR file paths in Lucee 5.3.2.77 is very exciting! The extensibility of ColdFusion - on top of Java - has always been a huge value-add; but now, with this augmented createObject() behavior, it becomes a seamless, modular workflow.
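One open question with the demo is that markdownToHtml() calls createObject() with the JAR paths on every invocation, and it is not clear from the documentation whether Lucee caches anything under the hood. If that turns out to matter, the following is a minimal sketch of caching the class proxies in the application scope. The loadFlexmarkClass() function name and the "flexmarkClasses" key are hypothetical, and the sketch assumes that the page runs inside a named application (ie, one with an Application.cfc). It has not been measured, so treat any performance benefit as unverified:

```cfml
<cfscript>

	/**
	* I return the Java class proxy for the given class name, creating it on first use
	* and reusing the cached proxy on subsequent calls.
	* 
	* NOTE: The "flexmarkClasses" cache key is a hypothetical name for this sketch.
	* 
	* @className I am the fully-qualified Java class name being loaded.
	* @output false
	*/
	public any function loadFlexmarkClass( required string className ) {

		// Serialize access so that the cache is only populated once per class name.
		lock name="flexmarkClasses" type="exclusive" timeout=5 {

			if ( ! structKeyExists( application, "flexmarkClasses" ) ) {

				application.flexmarkClasses = {};

			}

			if ( ! structKeyExists( application.flexmarkClasses, className ) ) {

				application.flexmarkClasses[ className ] = createObject(
					"java",
					className,
					[ expandPath( "./flexmark-all-0.42.12-lib.jar" ) ]
				);

			}

			return( application.flexmarkClasses[ className ] );

		}

	}

</cfscript>
```

With something like this in place, the load() closure inside markdownToHtml() could delegate to loadFlexmarkClass( className ) instead of calling createObject() directly.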


Reader Comments

10 Comments

Nice example, Ben.

I really like having this option in Lucee 5, but unfortunately "dynamic" only applies to loading jars, not updating them (i.e. reloading them with a different version).

Also, you may get version clashes if you try to load a library that's already in the Lucee core (I found this when I tried to load a newer version of Apache Commons Compress). I thought the OSGi architecture was supposed to prevent this, but apparently not.

In other words, if you want to be able to update a jar while Lucee is running, or load a different version of a library that Lucee has already loaded in its core then you'll need JavaLoader so that the jar is fully dynamic and isolated.

Shame as I would dearly love to get rid of JavaLoader as a dependency (no offence to the brilliant Mark Mandel who gave us an incredible tool!)

More detail on these tickets:

https://luceeserver.atlassian.net/browse/LDEV-1528
https://github.com/cfsimplicity/lucee-spreadsheet/issues/148

15,674 Comments

@Julian,

UGGGG. I was really nervous about that, as you can tell from the CAUTION I put at the top which was basically, "I don't really know what I'm saying here" :P I had hoped this functionality was basically the same. But, there wasn't much documentation on it; and, I couldn't really understand the low-level calls when I looked in GitHub at the implementation.

That is a shame. Like you, I was really excited about a native way to drop a dependency (no offense to the dependency itself).

61 Comments

On the OSGi architecture: in order to use a different version of something that is part of the Lucee core (or where you have already loaded an older version previously), you need to convert the JAR file to an OSGi bundle, which will contain a manifest file specifying the version number of that bundle. Then take that OSGi bundle and drop it into the "/bundles" directory; after that, you can specify the desired version number in the createObject() call.

Otherwise, just using createObject() for a straight JAR file will risk version clashes, because you are essentially doing the Java loading the old-school, non-OSGi way.

Ref:
https://docs.lucee.org/guides/lucee-5/osgi.html
https://dev.lucee.org/t/how-do-i-convert-an-existing-jar-file-into-an-osgi-bundle/374

10 Comments

@steven Thanks for clarifying the OSGi aspect. Obviously having to create a bundle to achieve the isolation means the process isn't dynamic - although I wonder if it could be made so (my java knowledge is very limited).

Somehow, thanks to the brilliance of Mark Mandel, JavaLoader is able to overcome both limitations to achieve truly dynamic and isolated loading despite doing things "the old school non OSGi way".

426 Comments

What a shame, it doesn't dynamically update. You would have thought by now, all those clever Java engineers building ACF/Lucee could have integrated a JavaLoader version into the core.

Just one thing Ben. Every time you call:

markdownToHtml()

You have to reload the java libraries again. I've always wanted to know how much this affects performance?

15,674 Comments

@Charles,

I can't be sure, but I assume - or I hope - that some caching is being implemented under the hood. Something that maps the file-paths onto some internal cache-key.

@All,

After looking at everyone's comments, it feels like I might as well just keep using the JavaLoader project. It is really easy to configure, and doesn't require any extra steps. Unless I'm missing some really obvious advantage of the OSGi approach? I'm not really a "Java guy", so the technical implications are not obvious to me.

3 Comments

I was having a real tough time getting this to load, and I did not want to go the OSGi route. I was able to finally dial in the right combination, and hopefully my explanation will make it more simple for others to understand how to quickly and easily dynamically load a jar file in Lucee.

I was able to load the ftp4j-1.7.2.jar file by placing it in the C:\lucee\tomcat\webapps\ROOT\WEB-INF\lucee\lib\ directory, and in a cfscript I put:

ftpClient = CreateObject("java","it.sauronsoftware.ftp4j.FTPClient", "C:/lucee/tomcat/webapps/ROOT/WEB-INF/lucee/lib/ftp4j.jar").init();

Voila! A simple dump(ftpClient); after that gave me access to all the PEM's.

Part of the trick is finding that "it.sauronsoftware...." part; find the manual for your class and search for it.

I did NOT need to load the java classpaths in Application.cfc either, which was good because this is only used by less than one percent of the application and only by certain people, so I didn't need to add to the memory usage for the rest of the community.

15,674 Comments

@Randy,

Agreed, finding the actual "class path" stuff can be the most complicated part of all of this, especially if (like me) you don't have a Java background. In ColdFusion, I'm used to using the file-path as the thing you load. But, in Java, they throw that package stuff in there, which is / can be completely different from the file-path. There's nothing quite as frustrating as the "Class not found" errors when instantiating Java objects :D

3 Comments

@Ben,

Totally agree -- and it doesn't help that most who post a "solution" only post an esoteric example, not actual working code or something that you can really sink your teeth into.

Often, they even use a variable name so similar or exactly like the function name that you're left scratching your head wondering which is which -- and I see I did the same thing I'm ranting about. D'oh!

Better would have been even a slight change, like this (added an underscore to differentiate the "ftpclient" references here):

ftp_Client = CreateObject("java","it.sauronsoftware.ftp4j.FTPClient", "C:/lucee/tomcat/webapps/ROOT/WEB-INF/lucee/lib/ftp4j.jar").init();

dump(ftp_Client);

For some, more precisely, where did I find the "it.sauronsoftware" class path? It sure was not right out there in the open, no sirree, although it WAS in the manual -- which took a little digging to find (both the manual and the reference). You would think this important information, needed for creating an object or for importing, would be a little more prevalent in the documentation.

Ben, keep up the good fight, and thanks for all you do to help us out!

15,674 Comments

@Randy L.,

Ha ha, re:

"Often, they even use a variable name so similar or exactly like the function name that you're left scratching your head wondering which is which -- and I see I did the same thing I'm ranting about. D'oh!"

I've actually been called-out for this on a few PRs at work. Yo, naming stuff is hard! It can be especially confusing when the difference is moving the location of a d or something, like:

var normalizedFilter = normalizeFilter( input );

... notice the d in normalized in the variable name. I don't know how to get around this kind of stuff :D This comes up more often than I would like, and I still don't have a great strategy for dealing with it.
