JavaSE | New posts here >>> rmannibucau.metawerx.net

BatchEE + ApplicationComposer = Batch.main(args)

Batches are quite particular as application cause you have generally the choice with keeping a JVM running and using a scheduler to trigger them or to just use an external scheduler to trigger the JVM and once the batch is done just exit the process.

The choice can depend on several criteria but most of the time in medium/big companies this choice is in place when you need to write a batch you just need to respect it.

Where having a long time running container is not different than any application, writing a main can be quite challenging.

Let’s see how OpenEJB (TomEE) and BatchEE makes it easy.

BatchEE is a JBatch implementation but provides a lot of tooling on top of JBatch like a management GUI, some maven plugins, a lot of extensions/components (csv, flr, json readers, chained processors, typed components etc…) and much more. One of the proposed tooling is to setup BatchEE “as” a container and run a web application (war) or a batch application (bar, format created by BatchEE dedicated to batches). This solution is nice since it is integrated with some well known containers like OpenEJB, Spring or any CDI embedded container (OpenWebBeans, Weld).

However the pitfalls of such a solution is to suppose you can deploy a “batch container” – BatchEE standalone – and your application. This is not always possible and the “standard” for your company can be to just deploy a jar.

The simplest solution is to write a (J)Batch relying only on SE features and do a shade but then you loose all the EE programming model which is very beneficial to write profesionnal and maintainable batches.

To be able to mix both you need to be able to wrap your batch execution once a container is launched.

Using the whole BatchEE ecosystem there are several solutions but a simple one I used several times with success is to use OpenEJB/TomEE ApplicationComposer to handle the container integration and just BatchEE Batches helper to wait for the end of the execution. One nice bonus with ApplicationComposer is that you fully control the classes part of your application which often allows to make your batch deployment lighter.

For our example we will suppose we have a batch with a single step which is a chunk. It will have the following chain:


fileReader -> entryProcessor -> httpWriter

Of course all these components are CDI beans.

Step 1: define your application

The first step is to define an ApplicationComposer model. This post is not about ApplicationComposer itself so let’s make it simple and define it like that:

public class SampleApplication {
    @Module
    @Classes(cdi = true, value = {
        FileReader.class, EntryProcessor.class, HttpWriter.class, HttpClient.class
    })
    public WebApp application() {
        return new WebApp();
    }
}

Of course in previous sample HttpWriter get injected HttpClient so if you need more CDI classes you need to register them (or use @Jars/@Default).

Step 2: execute your batch

Executing a batch is as simple as obtaining a JobOperator, starting your batch and waiting for its end.

If you use BatchEE, the main module provides Batches class which has this feature. The implementation either uses batchee internal to wait for the end or use a simple polling on the batch status.

In practise it means starting and waiting for the end of the batch looks like:

protected JobExecution runBatch(final String batchName, final Properties config) {
    final JobOperator operator = BatchRuntime.getJobOperator();
    final long id = operator.start(batchName, config);
    Batches.waitForEnd(operator, id);
    return operator.getJobExecution(id);
}

Step 3: ensure BatchEE is initialized properly

I already spoke about how to enforce BatchEE BeanManager initialization properly on this blog but in fully embedded mode and with OpenEJB 4.x where BatchEE is not integrated (as opposed to OpenEJB/TomEE 7) we need another step: enforce the initialization of the BeanManager and the TransactionSynchronizationRegistry.

Note: of course if you don’t care about JTA you only need the BeanManager integration.

The easiest is to use a @Singleton to do so:

@Startup
@Singleton // batchee spawns its own thread so ensure EE is init in an EJB with all EE instances bound
public class BatchEEBeanManagerInitializer {
    @PostConstruct
    private void forceInit() {
        try {
            // enfore init of jta tx registry in this ejb context
            Class.forName(Synchronizations.class.getName(), true, Thread.currentThread().getContextClassLoader());

            // enfore cdi bean manager capture
            ServicesManager.find().service(BatchArtifactFactory.class).load("init");
        } catch (final Exception e) {
            throw new IllegalStateException(e);
        }
    }

    @Named
    protected static class Init {
    }
}

Of course we need to add this class in our SampleApplication.

Step 4: handle the container lifecycle

So we have BatchEE properly setup, our application defined and our batch ready to be executed so we just miss a way to start the container. This is where ApplicationComposer does its job:

final SampleApplication app = new SampleApplication();
new ApplicationComposers(app).evaluate(app, () -> {
    // run your batch here
});

And here we are, we have all the steps to do our shade!

Step 5: wait we need a main!

When you have a single batch you can put it all together in a main(String[]) but it happens quite regularly to put several batches in the same “application” because they are close in term of code or just linked in term of lifecycle.

In this case using a plain main means handling the batch selection yourself. It also makes the wiring of the batch parameters harder – as you can guess since my chunk starts reading a file the file path will likely be a parameter you can pass to the command line.

Said more concretely it means the need of a “CLI” library is quickly obvious. Of course and without surprise I will use Tomitribe Crest but Args4j, Airline for instance are very good alternatives if you are used to them.

The nice thing putting all your launchers in a “command” class is you can share the reporting of the batches (it is important to report the execution time, batch status etc…). Also if your scheduling tool uses exit code to determine if the execution was successful or not it is a good place to share the exit code logic.

Here is what it looks like with Crest for our ApplicationComposer BatchEE batch:

public class Batch {
    @Command
    public static void sample(@Option("input") final String input,
                              @Out final PrintStream printStream) {
        of(new SampleApplication()).ifPresent(a -> run(printStream, a, () -> a.execute(input)));
    }

    private static void run(final PrintStream printStream, final Object app, final Supplier<JobExecution> report) {
        try {
            new ApplicationComposers(app).evaluate(app, () -> { // in container
                final JobExecution execution = report.get(); // run the batch

                // log the execution
                final String executionString = "Job name: " + execution.getJobName() + lineSeparator() +
                    "Execution Id: " + execution.getExecutionId() + lineSeparator() +
                    "Create Time: " + execution.getCreateTime() + lineSeparator() +
                    "Execution Time: " + TimeUnit.MILLISECONDS.toSeconds(
                    of(execution).map(JobExecution::getEndTime).map(Date::getTime).orElse(0L) - execution.getCreateTime().getTime()) + "s" + lineSeparator() +
                    "Exit Status: " + execution.getExitStatus() + lineSeparator() +
                    "Batch Status: " + execution.getBatchStatus();
                printStream.println(executionString);

                // exit status will be -100 if status is failing, -1 if an unexpected error occured and 0 if it succeed
                if (execution.getBatchStatus() != BatchStatus.COMPLETED) {
                    throw new FailedExecution(executionString, execution);
                }

                return null;
            });
        } catch (final RuntimeException re) {
            throw re;
        } catch (final Exception e) {
            throw new IllegalStateException(e);
        }
    }

    @Exit(-100)
    @Getter
    public static class FailedExecution extends CommandFailedException {
        private final JobExecution execution;

        public FailedExecution(final String executionString, final JobExecution execution) {
            super(new IllegalStateException(lineSeparator() + executionString), execution.getJobName());
            this.execution = execution;
        }
    }
}

Things to note there:

Exit code handling uses @Exit handling of crest
we added an execute method to our application which is responsible to wire command parameter to the batch
run method is generic for any batch command and handles the reporting of the batch execution and makes it returning -100 exit code if the batch failed

Side note on exit status: on some UNIx system using negative exit codes make it moving to the positive side cause it uses unsigned int (-100 becomes 156 for instance).

For completeness here is the SampleApplication with the full code:

public class SampleApplication {
    @Module
    @Classes(cdi = true, value = {
        BatchEEBeanManagerInitializer.class,
        FileReader.class, EntryProcessor.class, HttpWriter.class, HttpClient.class
    })
    public WebApp application() {
        return new WebApp();
    }

    public JobExecution execute(final String input) {
        return runBatch("sample", new Properties() {{
            setProperty("input", input);
        }});
    }

    protected JobExecution runBatch(final String batchName, final Properties config) {
        final JobOperator operator = BatchRuntime.getJobOperator();
        final long id = operator.start(batchName, config);
        Batches.waitForEnd(operator, id);
        return operator.getJobExecution(id);
    }
}

Note: of course since we know we execute batches we can move runBatch to be abstracted and use @Batch(name = “sample”) or something like that.

But wait, the part was about to get a main and we just got a command? Yes cause now our main can be Crest one: org.tomitribe.crest.Main.

Step 6: the big final, create the shade

Now we just need to shade it all together. Nothing very special here excepted we need to take care of java SPI (META-INF/services/*), of openwebbeans.properties files which need to be merged, to set the right main (Crest one) and if you use several BacthEE extensions with the BatchEE shortnames to merge batchee.xml:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.2</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <createDependencyReducedPom>false</createDependencyReducedPom>
        <shadedClassifierName>bundle</shadedClassifierName>
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
          <transformer implementation="org.apache.maven.plugins.shade.resource.XmlAppendingTransformer">
            <resource>META-INF/batchee.xml</resource>
          </transformer>
          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
            <resource>META-INF/openwebbeans/openwebbeans.properties</resource>
          </transformer>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>org.tomitribe.crest.Main</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>

Then once built you will get a artifact-version-bundle.jar and you can execute your batch using:

java -jar artifact-version-bundle.jar sample --input=/opt/applications/sample/input/data.csv

Looks a lot of steps maybe but if you step back it is the main trick is to wait for the batch before the end of the execution. A nice bonus is to wire BatchEE persistence to a database to be able to deploy somewhere else BatchEE administration interface to be able to follow batch executions graphically.

Last thing before closing this post, since all the model is based on ApplicationComposer, testing these batches is as easy as using ApplicationComposerRule or runner :).

Happy (thanks)batching!

Integrating loggers with some frameworks sometimes requires to wire some stream (OutputStream typically) to the logger itself.

Most of the time you pass to your “LoggerOutputStream” a Logger instance and a Level – or you have some condition in the implementation to select the level.

This is not a hard task but for some loggers you need a big switch to handle the level. Code is quite noisy for pretty much nothing.

With Java 8 the implementation is pretty simple, just use a method reference and that is it!

Nice thing is you can automatically type it “Consumer<String>.

Here is a sample with maven Log API:

import java.io.IOException;
import java.io.OutputStream;
import java.util.function.Consumer;

import static java.util.Optional.of;

public class LoggerOutputStream extends OutputStream {
    private static final int BUFFER_SIZE = 1024;

    private byte[] buffer;
    private int count;
    private int bufferLen;

    private final Consumer<String> log;

    public LoggerOutputStream(final Consumer<String> log) {
        this.log = log;
        this.bufferLen = BUFFER_SIZE;
        this.buffer = new byte[bufferLen];
        this.count = 0;
    }

    @Override
    public void write(final int b) throws IOException {
        if (b == 0 || b == '\n') {
            flush();
            return;
        }

        if (count == bufferLen) {
            final byte[] newBuf = new byte[bufferLen + BUFFER_SIZE];
            System.arraycopy(buffer, 0, newBuf, 0, bufferLen);
            buffer = newBuf;
            bufferLen = newBuf.length;
        }

        buffer[count] = (byte) b;
        count++;
    }

    @Override
    public void flush() {
        of(count).filter(c -> c > 0).ifPresent(c -> {
            log.accept(new String(buffer, 0, c));
            count = 0;
        });
    }

    @Override
    public void close() {
        flush();
    }
}

And the usage is then pretty simple as well:

// getLog() is available in a Mojo = maven plugin
OutputStream out = new LoggerOutputStream(getLog()::info);

The interesting part is a bit hidden but you need to look for the method ref usage (log.accept(new String(buffer, 0, c));) where actually without it you would get a big switch to handle it so it is a nice win regarding the simplicity of the code and its evolutivity!

New posts here >>> rmannibucau.metawerx.net

New posts here >>> https://rmannibucau.metawerx.net

Category Archives: JavaSE

BatchEE + ApplicationComposer = Batch.main(args)

BatchEE + ApplicationComposer = Batch.main(args)

Step 1: define your application

Step 2: execute your batch

Step 3: ensure BatchEE is initialized properly

Step 4: handle the container lifecycle

Step 5: wait we need a main!

Step 6: the big final, create the shade

Loggers integrations love method reference (java 8)

BatchEE + ApplicationComposer = Batch.main(args)

Step 1: define your application

Step 2: execute your batch

Step 3: ensure BatchEE is initialized properly

Step 4: handle the container lifecycle

Step 5: wait we need a main!

Step 6: the big final, create the shade

Share this:

Share this: