The objective of this talk is to illustrate how some of the “modern” MPI functions can be used to build new parallel tools in Julia and to interface with Julia's existing native parallelism. The motivation for this work is to simplify working with libraries such as Trilinos. We will begin by outlining the differences between Julia's native parallelism and the MPI “single program, multiple data” approach. Next follows a gentle introduction to the MPI-3 calls (now available in MPI.jl master) that perform communication in a one-sided way, so that not all processes need to call MPI routines at the same time.

This property is very useful for building a custom Julia IO (similar to TCPSocket) for point-to-point communication over MPI. This IO can then be used to build a Julia cluster manager, enabling native Julia parallelism over asynchronous MPI communication. The steps required to build this system (code here) will be outlined, together with some caveats encountered when implementing the interface required for IO. Once a program is running on an MPI cluster manager, it becomes possible to switch on the fly between native Julia parallel calls and MPI calls, which we will illustrate with a simple example.

Distributed arrays are another key area where one-sided MPI calls can be put to use. The MPIArrays.jl package provides such an array type. The advantage of using MPI directly rather than the Julia parallel framework is that the “single program, multiple data” paradigm is preserved, so interfacing with external libraries that use MPI becomes trivial. Furthermore, MPI supports high-performance interconnects such as InfiniBand that are available on many clusters. We will present the basic ideas behind MPIArrays and show how building the basic distributed AbstractArray takes just a few lines of Julia code. A benchmark using a dense matrix-vector product will show that it compares favorably to DistributedArrays.jl and the C++ library Elemental.
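To give a rough feel for why a basic distributed AbstractArray needs only a few lines, here is a minimal sketch in plain Julia. It is not the MPIArrays.jl implementation: the names `BlockVector`, `blockvector` and `owner` are hypothetical, and the remote storage is simulated with one ordinary `Vector` per rank. In a real MPI-backed array, the block lookup would instead be followed by a one-sided transfer from the owning rank.

```julia
# Hypothetical sketch: a block-distributed vector, with each rank's storage
# simulated by a local Vector. This is an assumption for illustration only,
# not the MPIArrays.jl data layout.
struct BlockVector{T} <: AbstractVector{T}
    blocks::Vector{Vector{T}}   # blocks[r + 1] holds the elements owned by rank r
    offsets::Vector{Int}        # offsets[r + 1] = number of elements owned by ranks < r
end

# Split n elements over nranks blocks as evenly as possible.
function blockvector(::Type{T}, n::Integer, nranks::Integer) where {T}
    base, extra = divrem(n, nranks)
    sizes = [base + (r < extra ? 1 : 0) for r in 0:nranks-1]
    offsets = cumsum([0; sizes])             # length nranks + 1, last entry is n
    blocks = [zeros(T, s) for s in sizes]
    return BlockVector{T}(blocks, offsets)
end

# Map a global index to (owning rank, index local to that rank).
function owner(v::BlockVector, i::Integer)
    b = searchsortedlast(v.offsets, i - 1)   # 1-based block number
    return b - 1, i - v.offsets[b]
end

# The minimal AbstractArray interface: size, getindex and setindex!.
Base.size(v::BlockVector) = (v.offsets[end],)
Base.IndexStyle(::Type{<:BlockVector}) = IndexLinear()

function Base.getindex(v::BlockVector, i::Int)
    @boundscheck checkbounds(v, i)
    rank, li = owner(v, i)
    return v.blocks[rank + 1][li]            # an MPI version would fetch from `rank` here
end

function Base.setindex!(v::BlockVector, x, i::Int)
    @boundscheck checkbounds(v, i)
    rank, li = owner(v, i)
    v.blocks[rank + 1][li] = x               # an MPI version would write to `rank` here
    return v
end

# Usage: 10 elements over 3 simulated ranks gives blocks of 4, 3 and 3 elements.
v = blockvector(Float64, 10, 3)
v[5] = 2.5        # global index 5 is the first element owned by rank 1
owner(v, 5)       # (1, 1)
```

Because the type implements `size`, `getindex` and `setindex!`, generic functions such as `sum` or `length` work on it unchanged, which is the main appeal of subtyping AbstractArray.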
We will then finish the talk with a concrete example of using MPIArrays to build the sparsity structure of a linear system to be solved by the Trilinos C++ library.