Department Overview - Data Solutions
The Data Solutions division is the commercial data business of Moody's Analytics, bringing together a vast catalogue of data to help business decision-makers assess risks and opportunities. We are best known for Orbis, the world's most useful and usable database of companies. We are also the definitive source of ratings-related data for companies and securities that have been rated by Moody's Investors Service. Data Solutions is also the home of NewsEdge, the premier collection of premium and web-aggregated business news, drawn from more than 23,000 breaking news sources, key trade publications, web pages, blogs, and social media feeds. All content is normalized, enriched, and delivered in under a second of release. We strive to deliver convenience and insight to our customers by eliminating the hassle of sourcing, preparing, and accessing data, adding value when decisions need to be made. Any analytical activity depends on reliable and accessible data, and we are proud to be a mission-critical information resource for decision-makers around the world.
Domain Overview
Data Management Platform
Data has become a critical asset for enterprises, playing a pivotal role in business operations and strategic decision-making. Our proprietary datasets power a diverse set of products and services and face ever-increasing demands for expanded coverage and data quality. Developing robust systems that leverage the latest in data-at-rest, data-in-motion, and data remediation technologies is the cornerstone of our Engineering future. These systems must continue to serve existing customers and supercharge growth opportunities. Our Data Platform leverages best-in-class cloud technologies and frameworks, with an emphasis on isolating code, simplifying data access controls and operations, and enhancing data security.
Role Overview
Experience & Qualifications:
- Able to build data pipelines on a cloud service such as AWS, Azure, or GCP.
- Able to work well within the constructs of an agile development process, including Scrum, unit testing, and continuous build and integration.
- Willingness to learn innovative technologies and capacity for self-directed learning.
- Exceptionally good verbal and written communication skills.
- Ability to handle multiple assignments concurrently and independently manage time.
- Works well in a fast-paced team environment; able to work under pressure to meet tight deadlines.
Desired Skills and Experience:
- Apache Spark
- Proficiency in programming languages, especially Python.
- Experience in building and optimizing big data pipelines and data sets. This includes designing, constructing, installing, testing, and maintaining highly scalable data management systems.
- Understanding of data structures and algorithms, as well as skills in distributed computing. The ability to develop procedures for data mining, data modeling, and data production.
- Experience with cloud services such as AWS, Google Cloud, or Azure for data storage and processing.
Nice to Have:
- Familiarity with Databricks Delta Lake.
- Familiarity with Kafka streaming service.
Day-to-Day:
We are seeking a highly skilled and experienced Python developer to join our team. Responsibilities will include, but are not limited to:
- Help maintain the existing data pipeline codebase.
- Participate in the exploration and execution of proofs of concept (POCs).
- Use Git for version control and collaborate with other developers on the team.
- Expand unit test coverage.
- Implement new data pipelines for new data sources.
- Adhere to Agile software development methodologies.
- Work with data formats such as JSON, Parquet, Avro, and XML.
- Use Visual Studio as the primary development environment.
- Create APIs using cloud services such as Amazon API Gateway.