Best and very much compatible ETL for Redshift
1.) All the components and transformations are available
2.) Specially oriented toward data ingestion using AWS services such as RDS, S3, SNS, and Redshift
3.) Easy admin and SSL setup
4.) Very easy diagrammatic representation of jobs
The latest button change for "switch version"
Agile Data Engineering
We use Matillion to produce highly curated datasets for customers. Matillion has allowed us to move this workload entirely into the data warehouse where the final work product lives, and cut processing time from days to hours.
Matillion allowed us to move a legacy ETL pipeline to BigQuery with speed, precision, and confidence. It is now used exclusively by our team for all of our ELT needs. The user interface is clean, intuitive, and allows members of the team to be immediately productive. Onboarding new users has never been easier.
Feature parity between the different flavors (Redshift, Snowflake, BigQuery) is lacking; however, Matillion continues to make progress in narrowing the gap.
Need on-the-fly connectivity to different instances of Redshift
I like the many API integrations in Matillion; they let me grab data directly from different sources without manually extracting and importing it into my database.
I have been using Matillion for Redshift for more than 2 years. When I set up a project in Matillion with a connection to a Redshift cluster, I sometimes need to extract data from another Redshift instance while a job is running. In that case there is no straightforward way to get the data from the other instance, the way there are components to extract data from databases such as SQL Server, Oracle, etc. Instead, I first need to run another job connected to the second Redshift cluster to extract the records, and then run a different job to load the extracted records from S3 into the main Redshift cluster.
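The two-step workaround described above (extract from the second cluster to S3, then load from S3 into the main cluster) could be sketched roughly as below. This is only an illustration and not a Matillion feature: it builds the Redshift `UNLOAD` and `COPY` statements as strings, and the helper names, bucket, role ARN, and table names are all hypothetical placeholders. It also assumes both clusters can reach the same S3 prefix through the given IAM role.

```python
def build_unload_sql(query: str, s3_prefix: str, iam_role: str) -> str:
    """Build an UNLOAD statement to run on the second (source) cluster."""
    # UNLOAD takes the SELECT as a quoted string, so single quotes
    # inside the query have to be doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS PARQUET;"
    )


def build_copy_sql(table: str, s3_prefix: str, iam_role: str) -> str:
    """Build a COPY statement to run on the main (target) cluster."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS PARQUET;"
    )


# Hypothetical values, for illustration only.
S3_PREFIX = "s3://example-bucket/exports/venue_"
IAM_ROLE = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"

if __name__ == "__main__":
    print(build_unload_sql(
        "select * from venue where venuestate='NV'", S3_PREFIX, IAM_ROLE))
    print(build_copy_sql("staging.venue", S3_PREFIX, IAM_ROLE))
```

Each statement would still have to be run against its own cluster, which is exactly the two-job setup the review describes.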
I like working with Matillion, but the tool could be made more flexible by adding some features.
1) Most of the components are user friendly.
2) Development of ETL orchestrations and transformation consumes less time.
3) Advanced features available in some of the components make complex scenarios achievable.
1) The OAuth documentation does not detail the permissions and access levels on account_id or client_id that a connector requires.
2) Metric naming conventions differ between the console and the Matillion data model, and no mapping document for these naming conventions is available.
3) A connector for Outbrain is not available.
4) Indirect file loading is not available. For example, if we have five files with the same structure, a regex cannot be used to load all of them in a single component (S3 Load, S3 Put, or Excel), and we need to use the file iterator instead.
5) Properties are lost when the component configuration is changed from Basic to Advanced. Ideally the component should keep all the Basic properties and then offer the additional settings.
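To illustrate the kind of regex-based file selection point 4 asks for, here is a minimal sketch in plain Python. It is not a Matillion component or API: the function name, the file names, and the pattern are all made up for the example, and it only shows how one pattern could pick out the five same-structure files from a listing.

```python
import re


def select_files(object_keys, pattern):
    """Return the keys whose full name matches the regular expression."""
    compiled = re.compile(pattern)
    return [key for key in object_keys if compiled.fullmatch(key)]


# Hypothetical listing: five files with the same structure plus two others.
keys = [
    "sales_1.csv",
    "sales_2.csv",
    "sales_3.csv",
    "sales_4.csv",
    "sales_5.csv",
    "sales_readme.txt",
    "returns_1.csv",
]

matched = select_files(keys, r"sales_\d+\.csv")

if __name__ == "__main__":
    for name in matched:
        print(name)
```

With a listing like this, a single pattern selects all five `sales_` files and skips the rest, which is what the file iterator currently has to be used for.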
A Great Cloud-based Tool
I like it overall. It is easy to quickly learn and get up to speed. It is made to work in the environment I'm working in (AWS) and has a nice, clean browser-based interface to build and schedule jobs in EC2. Email-based support is responsive. However, I sometimes feel it is too expensive compared to alternatives for the value it adds and the ELT model, while beneficial in many ways, hamstrings the component feature-set.
Well integrated with AWS. Great browser-based interface. Develop in the same place where the code will run, as opposed to other solutions I use that develop in Windows and run in Linux. Easy to schedule.
I have mixed feelings on the overall value. The hourly rate really adds up over time. This is probably my biggest misgiving. Otherwise, I sometimes get frustrated with the ELT model because it means if the feature you need isn't supported within Redshift, it's up to you to create Python scripts or Bash scripts to enable it.
Have been using Matillion for the past 9 months
We have developed a data warehouse for our business using Matillion. We employ 6k+ people, and Matillion was a perfect match for our needs. Storing data in Google BigQuery allows seamless access to it from other cloud products, e.g. Google Data Studio. I love the product and the limitless possibilities it gives us. Pulling data from anything you can think of, transforming it, and then outputting it into BigQuery is very easy and straightforward.
My background is in development and I love the simplicity of Matillion. You can very easily turn your thought process into blocks in Matillion and debug any issues at a very granular level.
Matillion crashes from time to time, meaning you need to log back on to the system, but its benefits definitely outweigh this small issue.
Extremely easy to use, but hits hard limits when dealing with lots of data
Overall experience isn't so great for the price we pay. Because of our hard limits we have to pay for two instances in order to handle our job loads. That wouldn't be a problem if we were able to scale the EC2 instance the software sits on. It makes no sense for an application to have specific hardware limits. Surely if we can spin up two separate instances to run concurrent jobs that are unrelated to each other, the same principle can be applied when scaling up with more cores, memory, etc.
The ease of use is definitely the strongest aspect of Matillion. The GUI is exceptionally intuitive and even though the application flow speaks for itself, the documentation is also great for additional support.
You can hit a hard limit quite fast with the hardware when working with concurrent jobs running (Java Heap issues etc). You cannot increase the size of the instance (m4.xlarge) which means it becomes increasingly difficult to optimize. We are always trying to find work-arounds with the amount of data we need to load.
Intuitive & Flexible
I love the GUI that Matillion employs - it makes creating and adjusting tasks very simple and transparent.
The scheduling of tasks is a little tricky, as we have to separately schedule the virtual machine to start up and shut down. Once this is done, though, it works faultlessly.
Great product for handling big data flow automation
Matillion - Cloud based, scalable, available ELT for Big data
Overall it is a user-friendly, secure ELT tool for big data use cases that is easy to develop and deploy with. It supports a lot of useful prebuilt connectors, which helps to reduce cost and time to market.
1) Easy to use
2) Several ingestion integrations
3) Custom RDBMS ingestion support
4) Robust transformation and Orchestration components
No availability of change data capture feature for many databases
Yes but ..
Best visual ELT on the market
Matillion helped us develop various business use cases: Ambient temperature monitoring, ERP data centralization, data quality management. Components such as change detection were really helpful.
With Matillion, you can harness the power of major cloud data warehouses in no time.
What I love most is that you can give it to data engineers as well as ETL engineers with no prior experience with big data.
Error management and replay can be tricky to develop.
Speed of ETL development and templated approaches.
I could build a complete ETL process with absolutely no training at all in just a few hours. All of the components I expected to see were there, and heaps more. When I needed help - there it was!
Versioning and branching are difficult when developing; there is a total reliance on unique component names.
Great piece of software
Importing data from different sources.
Easy to use. Easy to move a lot of different processes into one place.
Problematic API support. Pressing ESC closes windows without any warning, and I lose everything I prepared in that window! I would like more support for operations beyond reads, such as Elasticsearch index creation or the ability to do PUT, POST, and DELETE with the API.
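For readers wondering what the non-read calls mentioned above look like when scripted by hand, here is a rough sketch using only Python's standard library. Nothing in it is Matillion-specific: the `build_request` helper, the localhost URL, and the index name are assumptions for illustration. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_request(method, url, payload=None):
    """Build (but do not send) an HTTP request with an explicit method."""
    data = None
    headers = {}
    if payload is not None:
        # Send the body as JSON when one is given.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Hypothetical endpoint: Elasticsearch creates an index with PUT /<index>.
INDEX_URL = "http://localhost:9200/example-index"

create_index = build_request("PUT", INDEX_URL, {"settings": {"number_of_shards": 1}})
delete_index = build_request("DELETE", INDEX_URL)

if __name__ == "__main__":
    print(create_index.get_method(), create_index.full_url)
    print(delete_index.get_method(), delete_index.full_url)
```

Sending would be a matter of passing either object to `urllib.request.urlopen`, against a server you actually control.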
Matillion and Snowflake ELT
Building a data warehouse with AWS and Snowflake. Easy and quite fast to build ELT.
Easy to get into. Orchestrations and Transformations make it easy to build flows. Pushdown to Snowflake.
No standalone client, no easy way to automate/generate (somehow possible with grid variables though)
Best ETL (ELT) tool in the market for AWS Redshift
In love with the tool, I highly recommend it. It’s perfect for the data warehouse we are building.
Easy to use. Quicker ETL. Comprehensive documentation
Nothing comes to mind. Perhaps round the clock support would be helpful, especially for customers located in Australia.
Great Snowflake integration, poor API support
It solved some difficult problems for us, as long as you understand its limitations in certain areas.
Snowflake integration works very well. In-database features were what I was looking for.
Had to build most of my API calls using their Python script tool.
Responsive to User Needs
Gets better and better with each release. Keep up the good work!
I've been using Matillion for about 2 and a half years. I've seen the software improve so much in that time. If I ever found a feature to be lacking, it would be included within the next couple release cycles. The software really seems to adapt as user needs have been evolving. Some of my favorite recent features are the ability to configure some components with just text (really saves me time) and the many Grid orchestration components which allow me to greatly reduce the complexity of the orchestration jobs.
UX is sometimes lacking or inconsistent. For example, let's say my goal is to replace an existing job with a new job of the same name. I delete the old job, and import the new one with the same name. All Orchestration components that used that job will understand to use the new job, but for some reason the scheduler doesn't.
Also, jobs will have validation errors simply because the components haven't been validated (grey borders). It can be confusing because you may be searching for an error in your work when all you actually have to do is revalidate the job.
I really like Matillion ETL, but I was a bit disappointed with some of its limitations and quirks. For example, when Twitter updated its API, all my jobs stopped working, and the only thing I could do was develop everything again without using the Matillion component.
When I was trying to build upsert procedures on RDS databases, I was thrilled that the Matillion component has that option, but it simply didn't work. It seems these functionalities haven't received the same attention as the others. I contacted Matillion support, which is simply great, and these issues are still to be fixed.
When I wanted specific Python libraries, I couldn't get them working on my instance because pip was out of date and couldn't be updated. I tried to update it and messed up my instance.
Overall, Matillion is great when you're creating a data migration flow and need to parse the data beforehand and do some automation, but in some cases you need something a little more specific, and that's when it becomes a pain.
The ease of use. Matillion has a great set of tools that make complex and difficult processes simpler and faster to develop.
Many tools in Matillion don't get enough attention, and because of that they are fragile, while some of them do not work at all.
Matillion for Data Engineers
An easy-to-use tool that has allowed me to build a fully automated data warehouse to report to the business. It was easy to learn and implement, and as difficult as SQL to master. I think the real complexity comes from optimising complex workflows: the more T you need in the ETL/ELT process, the more Matillion falls down. That said, it is a very good tool for 90% of your work.
It is easy to use and quick to get started with. There are many inbuilt APIs and transformations that integrate it across a broad suite of systems and enable you to transform data for the business without having to spend too much time thinking about scripts or bespoke batch files.
The filter transformer is pretty minimal considering the power of the SQL it is utilising; it cannot use compound logic or reference other columns/objects in the query.
There are also optimisation issues in very complex workspaces.
Both of these have easy-to-use workarounds, but it is unfortunate that they are not better supported at the surface level.
Fast track to results
Very helpful support which takes care of all our problems, especially with "stupid" beginner questions.
Very easy to use and very easy to get results. Fast implementation cycles compared to other tools. Best of all is looping over variables, which enables simple but powerful solutions.
Sometimes more flexibility is needed on specific tasks, e.g. Excel or CSV import.