|Position title||Position opening date||Faculty / Division|
|DevOps position in the High Performance Computing Section||07/03/2023||
High Performance Computing Section, Computing Infrastructure Branch, Information Technology Division
We are seeking a senior, highly motivated DevOps professional with at least 3 years’ experience in GPU/AI operations to play a key role in operating HPC/AI/hybrid cloud systems and evaluating new technologies that support the frontier research of WIS scientists.
This individual will be part of a group that designs and builds HPC/AI/Cloud solutions, ensuring that upgrades and changes comply with product/project management guidelines.
He/she will work under the supervision of the head of the HPC section on planning and developing a robust and scalable infrastructure for AI/ML/DL workloads, integrating DL/ML frameworks, profiling applications, and supporting researchers.
Required education and skills:
* B.A./B.Sc. in information technology or an equivalent academic degree.
* Experience with GPU technologies and AI/ML/DL frameworks such as TensorFlow, MXNet, PyTorch, and Keras.
* Experience supporting centralized systems, at the core of the data center.
* Familiarity and experience with system performance analysis and benchmarking of standalone machines, HPC clusters, and GPU workloads.
* Strong shell scripting knowledge; experience installing and maintaining clustered environments, including automated installation, patch updates, and monitoring methods (Chef, Jenkins, Puppet, Ansible).
* Container automation and orchestration (experience with Docker, Kubernetes).
* Service- and customer-oriented attitude.
* Strong troubleshooting skills.
* Strong interpersonal and communication skills.
* Ability to work as a team player.
* Proactive and solution-oriented problem solver.
* Experience working with public cloud service providers – AWS, GCP, Azure.
* M.Sc. degree in information technology is an advantage.
* Experience with any of the following HPC schedulers or similar: Slurm, SGE, Torque/PBS, LSF.
* Experience with CI/CD in complex distributed systems.
* Experience documenting system administration procedures for routine and complex tasks.
* Knowledge of storage operations, in particular performance-oriented parallel filesystems (GPFS, Lustre, OrangeFS, BeeGFS).
* Experience with InfiniBand technology.