Editor’s Note: This is the third post of a series highlighting the inspiring response to COVID-19 from the Google Maps Platform community. This week we’re focusing on how a variety of our partners and solution providers are offering services to help their end customers and communities worldwide. For a look back at projects that help individuals help others in their local communities, take a look at our last post.
With the onset of the COVID-19 pandemic, we noticed many of our technology partners and solution providers started offering their services for free, or began to work even more closely with their customers to develop solutions designed to help inform and support communities. Today we’re sharing a few examples of these partners and their customers who are helping drive innovation and creative projects to help people during these times.
Web Geo Services
Web Geo Services, a Google Cloud Premier Partner with a focus in location innovation, works with customers across the retail, logistics, finance, transport and hospitality sectors. They created their own consumer geolocation platform, Woosmap, which offers location-based APIs that augment Google Maps Platform. When COVID-19 began to spread, the WebGeoServices team started offering services and access to the Woosmap platform free of charge for up to six months. Here are three recent projects that leverage Woosmap:
Born in Reggio Emilia during the Italian lockdown, at a time when individuals were staying home to help limit the spread of COVID-19, a team of three digital experts developed vicino-e-sicuro. The project is an interactive map of all businesses that offer home delivery or click & collect: groceries, restaurants, bakeries and other essential services. From there, the team partnered with Web Geo Services to access the Woosmap platform for free. They have since developed NearbyAndSafe in the UK and Proxisur in France, providing tools to support citizens and local businesses. Citizens can choose the type of services they need, consult information on delivery methods and prices for the service, and contact the merchant directly. Merchants, by registering with vicino-e-sicuro, NearbyAndSafe or Proxisur, can increase their visibility for free.
Valrhona also worked with Web Geo Services to build their interactive pastry map which allows users to find local pastry chefs and artisans in the US and Europe who sell their pastries, chocolates, bread, and other sweets using social distancing. Food and pastry are at the heart of so many cultures. Through the map, they’re able to direct people to passionate, hard-working chefs, as they continue working hard to provide others with familiar foods during an uncertain time.
Route4Me
Earlier this year Route4Me started offering their service free of charge to all government agencies at the federal, city, and municipal level across the world to support their efforts. Route4Me provides a route planning and mapping system that lets businesses find the optimal route between multiple destinations. The platform automatically plans routes for many people simultaneously; creates a detailed route manifest, a map with pins and route lines, and driving (or walking) directions; and dispatches the route directly to any smartphone. Their service will be available as an unlimited free subscription until the peak of the Coronavirus threat to the public is over. Let’s look at two projects that used Route4Me to deliver support to local communities.
The Foodbank of Santa Barbara County is working to provide enough healthy food to everyone who needs it in Santa Barbara County. The Foodbank created an initiative called Safe Access to Food for Everyone (SAFE) Food Net. Of the 50 SAFE Food Net distributions they’re operating, nearly 20 are brand new emergency drive-thru food distributions that make receiving healthy food fast, easy, discreet and safe. In addition, the Foodbank runs a home delivery program for seniors, providing enrollees with food deliveries at home. They worked with Route4Me to establish routes for this rapidly-growing home delivery initiative. Annually, the Foodbank serves 20,000 low-income seniors across the county.
Maverick Landing Community Services (MLCS) is a multi-service organization with a primary focus on helping children, youth, and adults to build 21st-century skills within Maverick Landing, East Boston, and surrounding communities. The MLCS team developed a COVID-19 response plan that not only required communicating directly with the community, but also provided a way to meet the community’s needs. MLCS worked with Route4Me to develop and use route maximization technology to increase expediency, efficiency, and reduce their carbon footprint while delivering grocery bags to keep the community fed and safe.
Unqork
Unqork developed the COVID-19 Management Hub, a solution available to any major city to integrate into its crisis management practice. The COVID-19 Management Hub application automates real-time mapping of the COVID-19 risk, maintains communication with residents in need, delivers critical services including food, medicine and other supplies, and coordinates multi-agency response and dispatch efforts in a single operations dashboard. Unqork is currently working with cities, states, and counties, including the City of New York and Washington, D.C., to help develop their COVID-19 Management Hub for city officials to manage the pandemic and provide access to critical information and resources. So far in New York the system has enabled the delivery of over 8 million meals and the collection of over $125M in PPE for front-line workers.
To learn more about the core features of Google Maps Platform that can be useful for building helpful apps during this time, visit our COVID-19 Developer Resource Hub.
For more information on Google Maps Platform, visit our website.
Redis is one of the most popular open source in-memory data stores, used as a database, cache and message broker. There are several deployment scenarios for running Redis on Google Cloud, with Memorystore for Redis as our integrated option. Memorystore for Redis offers the benefits of Redis without the cost of managing it.
It’s important to benchmark the system and tune it according to your particular workload characteristics before you expose it in production, even if that system depends on a managed service. Here, we’ll cover how you can measure the performance of Memorystore for Redis, as well as performance tuning best practices. Once you understand the factors that affect the performance of Memorystore for Redis and how to tune it properly, you can keep your applications stable.
Benchmarking Cloud Memorystore
First, let’s look at how to measure the benchmark.
Choose a benchmark tool
There are a few tools available to conduct benchmark testing for Memorystore for Redis. The tools listed below are some examples.
In this blog post, we’ll use YCSB, because it has a feature to control traffic and field patterns flexibly, and is well-maintained in the community.
Analyze the traffic patterns of your application
Before configuring the benchmark tool, it’s important to understand what the traffic patterns look like in the real world. If you have been running the application to be tested on Memorystore for Redis already and have some metrics available, consider analyzing them first. If you are going to deploy a new application with Memorystore for Redis, you could conduct preliminary load testing against your application in a staging environment, with Cloud Monitoring enabled.
To configure the benchmark tool, you’ll need this information:
The number of fields in each record
The number of records
Field length in each row
Query patterns such as SET and GET ratio
Throughput in normal and peak times
Configure the benchmark tool based on the actual traffic patterns
When conducting performance benchmarks for specific cases, it’s important to design the content of the benchmark by considering table data patterns, query patterns, and traffic patterns of the actual system.
Here, we’ll assume the following requirements.
The table has two fields per row
The maximum length of a field is 1,000,000 bytes
The maximum number of records is 100 million
Query pattern of GET:SET is 7:3
Usual traffic is 1k ops/sec and peak traffic is 20k ops/sec
YCSB can control the benchmark pattern with the configuration file. Here’s an example using these requirements. (Check out detailed information about each parameter.)
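The configuration file itself isn’t reproduced in this post, so here is a minimal sketch of a YCSB workload file that reflects the requirements above. The property names are standard YCSB CoreWorkload parameters; the request distribution and the workload class path are assumptions on my part rather than values from the original post.

# Sketch of a YCSB workload file for the requirements above
# (recent YCSB releases use the site.ycsb package; older ones use com.yahoo.ycsb)
workload=site.ycsb.workloads.CoreWorkload

# Two fields per record
fieldcount=2

# Maximum field length in bytes (see the note below about not combining the
# maximum fieldlength and recordcount in a single run)
fieldlength=1000000

# 100 million records
recordcount=100000000

# GET:SET ratio of 7:3
readproportion=0.7
updateproportion=0.3
insertproportion=0

# Assumed key-access distribution; adjust to match your application
requestdistribution=zipfian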
The actual system contains various field lengths, but YCSB supports only a fixed fieldlength. So if you configure fieldlength=1,000,000 and recordcount=100,000,000 at the same time, the benchmark data size will be far larger than that of the actual system.
In that case, run the following two tests:
The test in which fieldlength is the same as the actual system;
The test in which recordcount is the same as the actual system.
We will use the latter condition as an example for this blog post.
Test patterns and architecture
After preparing the configuration file, consider the test conditions, including test patterns and architecture.
Test patterns
If you’d like to compare performance with instances under different conditions, you should define the target condition. In this blog post, we’ll test with the following three patterns of memory size according to capacity tier.
Architecture
You need to create VMs to run the benchmark scripts. You should select a sufficient number and machine types so that VM resources don’t become a bottleneck when benchmarking. In this case, we’d like to measure the performance of Memorystore itself, so VMs should be in the same zone as the target Memorystore to minimize the effect of network latency. Here’s what that architecture looks like:
Run the benchmark tool
With these decisions made, it’s time to run the benchmark tool.
Runtime options to control the throughput pattern
You can control the client throughput by using both the operationcount parameter in the configuration file and the -target <num> command line option.
Here is an example of the execution command of YCSB:
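The command isn’t reproduced in this post, so the following is a hypothetical example; the host IP, port, thread count, and workload file name are placeholders, and -target 10 matches the explanation in the next paragraph.

# Hypothetical YCSB run command against a Memorystore for Redis instance
bin/ycsb run redis -s \
  -P workloads/memorystore_workload \
  -p redis.host=10.0.0.3 \
  -p redis.port=6379 \
  -threads 16 \
  -target 10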
With operationcount=3000 in the configuration file, running the above command means that YCSB sends 10 requests per second and issues 3,000 requests in total, so the run lasts about 300 seconds.
You should run the benchmark with incrementally increasing throughput, as shown below. Note that a single benchmark run should be somewhat long in order to reduce the impact of outliers:
Load benchmark data
Before running the benchmark, you’ll need to load data to the Memorystore instance that you’re testing. Here is an example of a YCSB command for loading data:
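As with the run command, the exact command isn’t shown here; this is a hypothetical example with placeholder host, port, thread count, and workload file values.

# Hypothetical YCSB load command to populate the instance with test records
bin/ycsb load redis -s \
  -P workloads/memorystore_workload \
  -p redis.host=10.0.0.3 \
  -p redis.port=6379 \
  -threads 32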
Run benchmark
Now that you have your data loaded and command chosen, you can run the benchmark test. Adjust the number of processes and instances to execute YCSB according to the load amount. In order to identify performance bottlenecks, you need to look at multiple metrics. Here are the typical indicators to investigate:
Latency
YCSB outputs latency statistics such as average, min, max, 95th and 99th percentile for each operation such as READ (GET) and UPDATE (SET). We recommend using the 95th or 99th percentile for the latency metrics, according to your customer service-level agreement (SLA).
Throughput
You can use the overall operation throughput, which YCSB outputs.
Resource usage metrics
You can check resource usage metrics such as CPU utilization, memory usage, network bytes in/out, and cache-hit ratio using Cloud Monitoring.
Performance tuning best practices for Memorystore
Now that you’ve run your benchmarks, you should tune your Memorystore using the benchmark results.
Depending on your results, you may need to remove a bottleneck and improve performance of your Memorystore instance. Since Memorystore is a fully managed service, various parameters are optimized in advance, but there are still items that you can tune based on your particular use case.
There are a few common areas of optimization:
Data storing optimizations
Memory management
Query optimizations
Monitoring Memorystore
Data storing optimizations
Optimizing the way you store data not only saves memory, but also reduces I/O and network bandwidth.
Compress data
Compressing data often results in significant savings in memory usage and network bandwidth.
We recommend Snappy and LZO tools for latency-sensitive cases, and GZIP for maximum compression rate. Learn more details.
JSON to MessagePack
MessagePack and Protocol Buffers have JSON-like schemas and are more compact than JSON. In addition, Redis Lua scripting has support for MessagePack.
Use Hash data structure
The Hash data structure can reduce memory usage. For example, suppose you have data stored by the query SET “date:20200501” “hoge”. If you have a lot of data that’s keyed by such consecutive dates, you may be able to reduce the memory usage that dictionary encoding requires by storing it as HSET “month:202005” “01” “hoge”. But note that this can cause high CPU utilization when the value of hash-max-ziplist-entries is too high. See here for more details.
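For illustration, here is a sketch of the two storage patterns just described; the second date and its value are hypothetical additions.

# One key per date: every key/value pair lives in the top-level keyspace
SET "date:20200501" "hoge"
SET "date:20200502" "fuga"

# Grouped by month in a hash: small hashes use a compact encoding,
# which reduces per-key overhead
HSET "month:202005" "01" "hoge"
HSET "month:202005" "02" "fuga"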
Keep instance size small enough
The memory size of a Memorystore instance can be up to 300GB. However, data larger than 100GB may be too large for a single instance to handle, and performance may degrade due to a CPU bottleneck. In such cases, we recommend creating multiple instances with small amounts of memory, distributing them, and changing their access points using keys on the application side.
Memory management
Effective use of memory is important not only in terms of performance tuning, but also in order to keep your Memorystore instance running stably without errors such as out of memory (OOM). There are a few techniques you can use to manage memory:
Set eviction policies
Eviction policies are rules for evicting data when the Memorystore instance memory is full. You can increase the cache hit ratio by specifying these parameters appropriately. There are three groups of eviction policies:
Noeviction: Returns an error if the memory limit has been reached when trying to insert more data
Allkeys-XXX: Evicts chosen data out of all keys. XXX is the algorithm name to select the data to be evicted.
Volatile-XXX: evicts chosen data out of all keys with an “expire” field set. XXX is the algorithm name to select the data to be evicted.
volatile-lru is the default for Memorystore. You can change the algorithm used to select data for eviction, as well as the TTL of your data, to fit your workload. See here for more details.
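On Memorystore, the eviction policy is exposed as the maxmemory-policy configuration parameter. As a sketch (the instance name and region are placeholders), changing it with gcloud might look like this:

# Switch the eviction policy of a Memorystore instance to allkeys-lru
gcloud redis instances update my-instance \
  --region=us-central1 \
  --update-redis-config maxmemory-policy=allkeys-lru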
Memory defragmentation
Memory fragmentation happens when the operating system allocates memory pages which Redis cannot fully utilize after repeated write and delete operations. The accumulation of such pages can result in the system running out of memory and can eventually cause the Redis server to crash.
If your instances run Redis version 4.0 or higher, you can turn on the activedefrag parameter for your instance. Active Defrag 2, which is part of Redis version 5.0, has a smarter strategy. Note that this feature is a tradeoff with CPU usage. See here for more details.
Upgrade Redis version
As we mentioned above, the activedefrag parameter is only available in Redis version 4.0 or later, and version 5.0 has a better strategy. In general, with a newer version of Redis, you can reap the benefits of performance optimization in many ways, not just in memory management. If your Redis version is 3.2, consider upgrading to 4.0 or higher.
Query optimizations
Since query optimization can be performed on the client side and doesn’t involve any changes to the instance, it’s the easiest way to optimize an existing application that uses Memorystore.
Note that the effect of query optimization cannot be checked with YCSB, so run your query in your environment and check the latency and throughput.
Use pipelining and mget/mset
When multiple queries are executed in succession, network traffic caused by round trips can become a latency bottleneck. In such cases, using pipelining or aggregated commands such as MSET/MGET is recommended.
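As a small illustration with placeholder keys and values, compare issuing three commands separately with aggregating them; redis-cli can also pipeline a prepared file of commands in a single pass.

# Three separate commands: three network round trips
SET key1 "a"
SET key2 "b"
SET key3 "c"

# Aggregated commands: one round trip each
MSET key1 "a" key2 "b" key3 "c"
MGET key1 key2 key3

# Pipelining a file of commands with redis-cli (host is a placeholder)
cat commands.txt | redis-cli -h 10.0.0.3 --pipe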
Avoid heavy commands on many elements
You can monitor slow commands using the SLOWLOG command. SORT, LREM, and SUNION, which operate on many elements, can be computationally expensive. Check whether these slow commands are causing problems, and if so, consider reducing these operations.
Monitoring Memorystore using Cloud Monitoring
Finally, let’s discuss resource monitoring for predicting performance degradation of existing systems. You can monitor the resource status of Memorystore using Cloud Monitoring.
Even when you benchmark Memorystore before deploying, the performance of Memorystore in production may degrade due to various influences such as system growth and changes in usage trends. In order to predict such performance degradation at an early stage, you can create a system that alerts you, or scales the system automatically, when the state of a resource exceeds a certain threshold.
You know me, I always learn something new and/or exciting, and I assume most of you have a similar habit. It has tons of advantages: you can do cool things almost every day, and you have several options to offer when you are facing a challenge.
Unfortunately, it is sometimes a struggle to find the best training source and format. Although there are a million options, you should strive for diversity to avoid burnout. To help you reach that goal, today I want to introduce you to Learn TV and the continued investments in Microsoft Learn.
As usual, Microsoft makes huge efforts to support our professional development. For instance, the Azure documentation is far better than the documentation of other hyperscalers. Channel 9 is another great source of knowledge…
So what exactly is Learn TV? This is a really great question, and here is the answer:
Learn TV is a new content experience offering daily live, pre-recorded, and on-demand video programming for developers and engineers within the Microsoft Learn platform.
Jeff Sandquist, Corporate Vice President, Developer Relations.
Although Learn TV is still in preview, I am sure it will go live soon. That is the usual pattern at Microsoft.
Microsoft Q&A
This is a fairly old know-how collection from Microsoft. Nevertheless, it was not widely available to everyone before; now it is generally available to all. You can reach it here:
This is a community for everyone who needs help with new technologies or who wants to learn from student leaders. Additionally, you can now become a student leader easily. It’s a really great chance for everyone, isn’t it?
That was a short introduction to these cool resources. Let’s start discovering them. I also suggest creating a learning plan for the summer to be as efficient as possible during the summer months.
If you have any feedback, positive or negative, it would be greatly appreciated if you could use the contact form under the Contact menu.
Against a backdrop of continuous change, I’ve been struck by what remains constant—our partner ecosystem and customers are teaming with us across various industries and geographies to help people in incredible ways during this time. Together, we’re helping hospitals acutely impacted by COVID-19, retailers and grocers address dramatic shifts in consumer behavior, employees rapidly adjust to working at home, and IT teams ensure critical systems stay up and running.
Over the last few weeks, I’ve witnessed many individuals across Google Cloud and our partner ecosystem working hand-in-hand to collaborate, innovate, and do what is needed to help our customers. In today’s post, I will share a few examples of the agility and ingenuity across our partners and customers that have inspired us in recent weeks.
Helping healthcare providers address the pandemic
At the onset of COVID-19, healthcare and life sciences organizations faced strains upon their workforce, supply chains, and IT systems, and were tasked with keeping people around the world healthy. Below are just a few examples of how Google Cloud partners have helped these important organizations keep key teams connected and leverage data in their responses to COVID-19.
In São Paulo, our partner Loud Voice Services helped Hospital das Clínicas develop a voice assistant that manages the flow of scheduling appointments, exams, and medication administration using Google Cloud’s Speech API. This service helped Hospital das Clínicas respond virtually to certain requests, helping to reduce crowding in its facilities each day and improve safety practices for its patients and staff. Said Hospital das Clínicas Corporate Technology Director, Vilson Cobello Junior, “This project will allow the prioritization of the most critical patients who needed to be evaluated in person.”
Our partner SADA worked with HCA Healthcare to develop the COVID-19 National Response Portal, helping healthcare providers across the country share important data on ICU bed utilization, ventilator availability, and COVID-19 cases. “The National Response Portal is unique in the combination of data and technology it will bring together,” says Michael Ames, Senior Director of Healthcare and Life Sciences at SADA. “The goal of the collaboration between SADA, HCA Healthcare, and Google Cloud is to provide the tools necessary to understand, manage, and end the COVID-19 pandemic.”
Google Cloud partner Maven Wave, an Atos Company, helped Amedisys, a leading provider of home health, hospice and personal care, build a chatbot using Dialogflow, the conversational AI platform that powers Contact Center AI, to allow employees to self-screen and report symptoms of COVID-19 through a simple voice interface. The chatbot can also provide Amedisys employees with up-to-date information on COVID-19, helping promote the safety of both patients and employees.
Supporting public sector responses to the pandemic
During the pandemic, governments and public sector agencies around the globe have been tasked with maintaining and expanding critical services, such as unemployment benefits and insurance, while keeping employees safe. We’re proud that our partners are providing key solutions and support to these public sector customers, including:
2020 unemployment claims in the State of Illinois are five times greater than the claims filed in the first weeks of the 2008 recession. To help the state manage this increase in inquiries, Quantiphi and Carahsoft, two Google Cloud services and software partners, used Google Cloud Contact Center AI to build a 24/7 chatbot capable of immediately answering frequently asked questions from people filing for unemployment. So far, this chatbot has been able to process and respond to over 3.2 million inquiries from filers, allowing the state to provide necessary information quickly and effectively to those in need of government assistance.
In the UK, Google Cloud Partner Ancoris helped Hackney Council, a government authority in East London, migrate to G Suite to help increase collaboration among its employees. When London began sheltering in place, the Council’s new collaboration tools became essential to the Hackney community. “In the first week of working from home, we saw a 700% increase in the use of Google Meet, which staff were using to work collaboratively in teams, even when apart,” says Henry Lewis, Head of Platform at Hackney Council. This enabled the Council to continue delivering its residents and businesses a wide range of essential services, including waste collection, housing assistance, and social care.
Helping educational institutions stay connected and accelerate research during COVID-19
In response to COVID-19, educators have had to quickly enable remote or work-from-home scenarios, while researchers quickly looked for ways to apply cloud computing capabilities to their search for effective therapies. Many universities began implementing technologies from Google Cloud and from our partners to enable remote work and to help accelerate important research, including:
Google Cloud and our partner Palo Alto Networks are enabling faculty, students and staff at Princeton University to stay securely connected to on campus resources, even when off campus. Palo Alto Networks partnered with the university’s Office of Information Technology to replace a legacy VPN solution with Prisma Access on Google Cloud, a solution that can secure remote access to the university community’s resources. Now, over 2,000 members of the university community are actively engaged in teaching, learning, research and administrative opportunities in a secure and scalable manner while being away from the Princeton campus.
Harvard University built VirtualFlow, an open-source drug discovery platform on Google Cloud, using our technology partner SchedMD’s Slurm platform to help accelerate research on COVID-19 treatment options. Now, Harvard is able to speed and scale up its testing of compounds to more quickly identify promising therapies to enter clinical trials.
Enabling financial services firms to support their clients
Financial services firms have had to maintain uptime and reliability of critical infrastructure while providing important products and services to their customers and the economic system at large. Our partners around the world are helping many of these institutions leverage Google Cloud technologies to do so.
In India, CMS Info Systems, a cash and payments solutions company, developed a program to provide cash to seniors at their homes. The company leveraged our partner MediaAgility’s Dista platform, built on Google Cloud and Google Maps Platform, to automate cash logistics workflow, provide real-time visibility of operations, and pick-up and deposit confirmation. This partnership will allow CMS Info Systems to ensure safe availability of cash and essential services for their more vulnerable customers during shelter-in-place procedures.
Google Cloud Partner Injenia helped Italian financial services provider Credem leverage G Suite and Google Meet to ensure business continuity during the pandemic. The bank is using collaboration tools like Drive and Docs to help employees adapt to the work-from-home environment around the country. Credem is also using Google Meet for seamless video conferencing to engage with its customers remotely accessing the bank’s consultancy.
Supporting companies in retail, media, and other industries
Many businesses, especially retail firms, telecommunications organizations, and media companies have had to address rapid shifts in consumer habits. To manage these shifts, it was critical to keep important teams connected, to leverage data to meet their customers’ needs, and ultimately to maintain business continuity. We’re proud that many of our partners have been key to helping these customers.
Due to the pandemic, TELUS International, a leading provider of customer experience and digital IT services, quickly transitioned tens of thousands of its employees to a work-from-home model. Many of these employees typically log into a desktop to access company software to connect with customers over voice, email and chat. Google Cloud Partner itopia helped TELUS International deploy a fully-configured virtual desktop environment in just 24 hours, allowing employees to remain connected and provide much needed service support to their clients. “Working with Google Cloud and itopia allowed us to transition key parts of our workforce—securely, globally and resiliently—all while keeping our team members engaged in what will certainly become part of the ‘new norm,’” says Jim Radzicki, CTO, TELUS International.
In Belgium, DPG Media, a leading media group in Benelux, wanted to limit the number of producers in their studios, while empowering them to create content from home. Google Cloud Partner g-company helped DPG Media migrate to G Suite to make this possible. With Google Meet, radio DJs can now facilitate the same fluid conversations between hosts during a broadcast, even with everyone working remotely in their home studios.
Grocery delivery wholesaler Boxed has seen increased demand for their goods and services during the COVID-19 pandemic. With essential supplies quickly coming in and out of availability, they needed a highly scalable database platform to manage their real-time supply chain and traffic levels. They collaborated with Google Cloud partner MongoDB to migrate their data to MongoDB Atlas on Google Cloud to help increase the scalability of their database and meet rising demand.
In Germany, our partner Cloudwürdig helped Burger King Germany roll out G Suite for its corporate employees, enabling them to more seamlessly work from home and maintain consistent operations across more than 700 restaurant locations. In fact, all of Burger King Germany’s internal meetings are being hosted over Google Meet during the pandemic. “We’re big fans of Google Meet. The speech quality is good and it’s helpful to be able to see every participant,” says Oliver Mielentz, Senior Manager IT, Burger King Deutschland GmbH. “This is extremely helpful these days.”
Looking ahead
I’d like to extend a heartfelt thanks to the many people behind the scenes across our partners, customers, and within Google Cloud who are working together to enable business continuity in so many key areas of our communities today. As many of us are juggling more than we ever have before, on both the personal and professional front, I’m especially struck at how our partners and customers are stepping up to the current challenges at hand.
Within our team, our commitment to you is stronger than ever—we’ll be right there alongside you, every step of the way as we help move businesses, and the world, forward.
You’ve built a beautiful, reliable service, and your users love it. After the initial rush from launch is over, realization dawns that this service not only needs to be run, but run by you! At Google, we follow site reliability engineering (SRE) principles to keep services running and users happy. Through years of work using SRE principles, we’ve found there are a few common challenges that teams face, and some important ways to meet or avoid those challenges. We’re sharing some of those tips here.
In our experience, the three big sources of production stress are:
Toil
Bad monitoring
Immature incident handling procedures
Here’s more about each of those, and some ways to address them.
1. Avoid toil
Toil is any kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows. This doesn’t mean toil has no business value; it does mean we have better ways to solve it than just manually addressing it every time.
Toil is pernicious. Without constant vigilance, it can grow out of control until your entire team is consumed by it. Like weeds in a garden, there will always be some amount of toil, but your team should regularly assess how much is acceptable and actively manage it. Project planners need to make room for “toil-killer” projects on an ongoing basis.
Some examples of toil are:
Ticket spam: an abundance of tickets that may or may not need action, but need human eyes to triage (e.g., notifications about running out of quota).
A service change request that requires a code change to be checked in, which is fine if you have five customers. However, if you have 100 customers, manually creating a code change for each request becomes toil.
Manually applying small production changes (e.g., changing a command line, pushing a config, clicking a button) in response to varying service conditions. This is fine if it’s required only once a month, but becomes toil if it needs to happen daily.
Regular customer questions on several repeated topics. Can better documentation or self-service dashboards help?
This doesn’t mean that every non-coding task is toil. For example, non-toil things include debugging a complex on-call issue that reveals a previously unknown bug, or consulting with large, important customers about their unique service requirements. Remember, toil is repetitive work that is devoid of enduring value.
How do you know which toilsome activities to target first? A rule of thumb is to prioritize those that scale unmanageably with the service. For example:
I need to do X more frequently when my service has more features
Y happens more as the size of service grows
The number of pages scales with the service’s resource footprint
And in general, prioritize automation of frequently occurring toil over complex toil.
2. Eliminate bad monitoring
All good monitoring is alike; each bad monitoring setup is unique in its own way. Setting up monitoring that works well can help you get ahead of problems, and solve issues faster. Good monitoring alerts on actionable problems. Bad monitoring is often toilsome, and some of the ways it can go awry are:
Unactionable alerts (i.e., spam)
High pager or ticket volume
Customers asking for the same thing repeatedly
Impenetrable, cluttered dashboards
Service-level indicators (SLIs) or service-level objectives (SLOs) that don’t actually reflect customers’ suffering. For example, users might complain that login fails, but your SLO dashboard incorrectly shows that everything is working as intended. In other words, your service shouldn’t rely on customer complaints to know when things are broken.
Poor documentation; useless playbooks.
Discover sources of toil related to bad monitoring by:
Keeping all tickets in the same spot
Tracking ticket resolution
Identifying common sources of notifications/requests
Ensuring operational load does not exceed 50%, as prescribed in the SRE Book
3. Establish healthy incident management
No matter the service you’ve created, it’s only a matter of time before your service suffers a severe outage. Before that happens, it’s important to establish good practices to lessen the confusion in the heat of outage handling. Here are some steps to follow so you’re in good shape ahead of an outage.
Practice incident management principles
Incident management teaches you how to organize an emergency response by establishing a hierarchical structure with clear roles, tasks, and communication channels. It establishes a standard, consistent way to handle emergencies and organize an effective response.
Make humans findable
In an urgent situation, the last thing you want is to scramble around trying to find the right human to talk to. Help yourselves by doing the following:
Create your own team-specific urgent situation mailing list. This list should include all tech leads and managers, and maybe all engineers, if it makes sense.
Write a short document that lists subject matter experts who can be reached in an emergency. This makes it easier and faster to find the right humans for troubleshooting.
Make it easy to find out who is on-call for a given service, whether by maintaining an up-to-date document or by writing a simple tool.
At Google, we have a team of senior SREs called the Incident Response Team (IRT). They are called in to help coordinate, mitigate and/or resolve major service outages. Establishing such a team is optional, but may prove useful if you have outages spanning multiple services.
Establish communication channels
One of the first things to do when investigating an outage is to establish communication channels in your team’s incident handling procedures. Some recommendations are:
Agree on a single messaging platform, whether it be Internet Relay Chat, Google Chat, Slack, etc.
Start a shared document for collaborators to take notes in during outage diagnosis. This document will be useful later on for the postmortem. Limit permissions on this document to prevent leaking personally identifiable information (PII).
Remember that PII doesn’t belong in the messaging platform, in alert text, or company-wide accessible notes. Instead, if you need to share PII during outage troubleshooting, restrict permissions by using your bug tracking system, Google docs, etc.
Establish escalation paths
It’s 2am. You’re jolted awake by a page. Rubbing the sleep from your eyes, you fumble around the dizzying array of multi-colored dashboards, and realize you need advice. What do you do?
Don’t be afraid to escalate! It’s OK to ask for help. It’s not good to sit on a problem until it gets even worse—well-functioning teams rally around and support each other.
Your team will need to define its own escalation path. Here is an example of what it might look like:
If you are not the on-call, find your service’s on-call person.
If the on-call is unresponsive or needs help, find your team lead (TL) or manager. If you are the TL or manager, make sure your team knows it’s OK to contact you outside of business hours for emergencies (unless you have good reasons not to).
If a dependency is failing, find that team’s on-call person.
If you need more help, page your service’s panic list.
(optional) If people within your team can’t figure out what’s wrong or you need help coordinating with multiple teams, page the IRT if you have one.
Write blameless postmortems
After an issue has been resolved, a postmortem is essential. Establish a postmortem review process so that your team can learn from past mistakes together, ask questions, and keep each other honest that follow-up items are addressed appropriately.
The primary goals of writing a postmortem are to ensure that the incident is documented, that all contributing root cause(s) are well-understood, and that effective preventive actions are put in place to reduce the likelihood and/or impact of recurrence.
All postmortems at Google are blameless postmortems. A blameless postmortem assumes that everyone involved had good intentions and responded to the best of their ability with the information they had. This means the postmortem focuses on identifying the causes of the incident without pointing fingers at any individual or team for bad or inappropriate behavior.
Recognize your helpers
It takes a village to run a production service reliably, and SRE is a team effort. Every time you’re tempted to write “thank you very much for doing X” in a private chat, consider writing the same text in an email and CCing that person’s manager. It takes the same amount of time for you and brings the added benefit of giving your helper something they can point to and be proud of.
May your queries flow and the pager be silent! Learn more in the SRE Book and the SRE Workbook.
Thanks to additional contributions from Chris Heiser and Shylaja Nukala.
Apache Spark is commonly used by companies that want to explore large amounts of data and perform additional machine learning (ML)-related tasks at scale. Data scientists often need to examine these large datasets with the help of tools like Jupyter Notebooks, which plug into the scalable processing powerhouse that is Spark and also give them access to their favorite ML libraries. The new Dataproc Hub brings together interactive data research at scale and ML from within the same notebook environment (either from Dataproc or AI Platform) in a secure and centrally managed way.
With Google Cloud, you can use the following products to access notebooks:
Dataproc is a Google Cloud-managed service for running Spark and Hadoop jobs, in addition to other open source software of the extended Hadoop ecosystem. Dataproc also provides notebooks as an Optional Component and is securely accessible through the Component Gateway. Check out the process for Jupyter notebooks.
Although both of those products provide advanced features to set up notebooks, until now:
Data scientists either needed to choose between Spark and their favorite ML libraries or had to spend time setting up their environments. This could prove cumbersome and often repetitive. That time could be spent exploring interesting data instead.
Administrators could provide users with ready-to-use environments but had little means to customize the managed environments based on specific users or groups of users. This could lead to unwanted costs and security management overhead.
Data scientists have told us that they want the flexibility of running interactive Spark tasks at scale while still having access to the ML libraries that they need from within the same notebook and with minimum setup overhead.
Administrators have told us that they want to provide data scientists with an easy way to explore datasets interactively and at scale while still ensuring that the platform meets the costs and security constraints of their company.
We’re introducing Dataproc Hub to address those needs. Dataproc Hub is built on core Google Cloud products (Cloud Storage, AI Platform Notebooks and Dataproc) and open-source software (JupyterHub, Jupyter and JupyterLab).
By combining those technologies, Dataproc Hub:
Provides a way for data scientists to quickly select the Spark-based predefined environment that they need without having to understand all the possible configurations and required operations. Data scientists can combine this added simplicity with existing Dataproc advantages that include:
Agility: provided by ephemeral (usually short-lived or job-scoped) clusters that can start in seconds, so data scientists don’t have to wait for resources.
Scalability: managed by autoscaling policies so scientists can run research on sample data and run tests at scale from within the same notebook.
Durability: backed by Cloud Storage outside of the Dataproc cluster, which minimizes chances of losing precious work.
Facilitates the administration of standardized environments to make it easier for both administrators and data scientists to transition to production. Administrators can combine this added security and consistency with existing Dataproc advantages that include:
Flexibility: implemented by initialization actions that run additional scripts when starting a cluster to provide data scientists with the libraries that they need.
Velocity: provided by custom images that minimize startup time through pre-installed packages.
Availability: supported by multiple master nodes.
Getting started with Dataproc Hub
To get started with Dataproc Hub today, using the default setup:
4. Choose Dataproc Hub from the Smart Analytics Frameworks menu.
5. Create the Dataproc Hub instance that meets your requirements and fits the needs of the group of users that will use it.
6. Wait for the instance creation to finish and click on the OPEN JUPYTERLAB link.
7. This should open a page that either shows you a configuration form or redirects you to the JupyterLab interface. If this works, take note of the URL of the page that you opened.
8. Share the URL with the group of data scientists that you created the Dataproc Hub instance for. Dataproc Hub identifies the data scientist when they access the secure endpoint and uses that identity to provide them with their own single-user environment.
Predefined configurations
As an administrator, you can add customization options for data scientists. For example, they can select a predefined working environment from a list of configurations that you curated. Cluster configurations are declarative YAML files that you define by following these steps:
Store the YAML configuration files in a Cloud Storage bucket accessible by the identity of the instance that runs the Dataproc Hub interface.
Repeat this for all the configurations that you want to create.
Set an environment variable with the Cloud Storage URIs of all the relevant YAML files when creating the Dataproc Hub instance. (A sketch of one way to produce such a configuration file follows these steps.)
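The exact schema of these configuration files isn’t shown in this post. As a sketch (the cluster name, region, bucket, and file name are placeholders), one way to produce such a file is to export the configuration of an existing cluster and upload it to the bucket; depending on your gcloud version, the export command may still be in the beta track.

# Export an existing cluster's configuration as a declarative YAML file
gcloud dataproc clusters export my-template-cluster \
  --region=us-central1 \
  --destination=small-spark-env.yaml

# Make the file available to the Dataproc Hub instance
gsutil cp small-spark-env.yaml gs://my-configs-bucket/dataprochub/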
Note: If you provide configurations, a data scientist who accesses a Dataproc Hub endpoint for the first time will see the configuration form mentioned in Step 6 above. If they already have a notebook environment running at that URL, Dataproc Hub will redirect them directly to their notebook.
Cloud Identity and Access Management (Cloud IAM) is central to most Google Cloud products and provides two main features for our purposes here:
Identity: defines who is trying to perform an action.
Access: specifies whether an identity is allowed to perform an action.
In the current version of Dataproc Hub, all spawned clusters use the same customizable service account, set up by following these steps:
An administrator provides a service account that will act as a common identity for all spawned Dataproc clusters (see the sketch after these steps). If it is not set, the default service account for Dataproc clusters is used.
When a user spawns their notebook environment on Dataproc, the cluster starts with that identity. Users do not need the roles/iam.serviceAccountUser role on that service account because Dataproc Hub is the one spawning the cluster.
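As a sketch of that first step (the service account name, project ID, and role are illustrative assumptions, not values from this post), an administrator might set up the common identity like this:

# Create a service account to act as the identity of spawned clusters
gcloud iam service-accounts create dataproc-hub-clusters \
  --display-name="Dataproc Hub spawned clusters"

# Allow it to act as a Dataproc cluster service account
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:dataproc-hub-clusters@my-project.iam.gserviceaccount.com" \
  --role="roles/dataproc.worker"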
Tooling optimizations
For additional tooling that you might want for your specific environment, check out the following:
Use Dataproc custom images in order to minimize the cluster startup time. You can automate this step by using the image provided by the Cloud Builder community. You can then provide the image reference in your cluster configuration YAML files.
Extend Dataproc Hub by using the Dataproc Hub Github repository. This option runs your own Dataproc Hub setup on a Managed Instance Group, similar to the version hosted on AI Platform Notebooks but including additional customization capabilities, such as custom DNS, identity-aware proxy, high availability for the front end, and options for internal endpoint setup.
Both Dataproc Hub on AI Platform Notebooks and its extended version on Managed Instance Groups share the same open-sourced Dataproc Spawner and are based on JupyterHub. If you want to provide additional options to your data scientists, you can further configure those tools when you extend Dataproc Hub.
If you need to extend Dataproc Hub, the Github repository provides an example that sets up the following architecture using Terraform:
Next steps
Get familiar with the Dataproc spawner to learn how to spawn notebook servers on Dataproc.
Get familiar with the Dataproc Hub example code in Github to learn how to deploy and further customize the product to your requirements.
Google Cloud is announcing the beta release of smart analytics frameworks for AI Platform Notebooks. Smart analytics frameworks bring the model training and deployment offered by AI Platform together with the ingestion, preprocessing, and exploration capabilities of our smart analytics platform. With smart analytics frameworks for AI Platform Notebooks, you can run petabyte-scale SQL queries with BigQuery, generate personalized Spark environments with Dataproc Hub, and develop interactive Apache Beam pipelines to launch on Dataflow, all from the same managed notebooks service provided by Google Cloud AI Platform.
These new frameworks can help bridge the gap between cloud tools and bring a secure way to explore all kinds of data. Whether you’re sharing visualizations, presenting an analysis, or interacting with live code in more than 40 programming languages, the Jupyter notebook is the prevailing user interface for working with data. As data volumes grow and businesses aim to get more out of that data, there has been a rapid uptake in the types of data pipelines, data source availability, and plugins offered by these notebooks. While this proliferation of functionality has enabled data users to discover deep insights into the toughest business questions, the increased data analysis capabilities have been coupled with increased toil: Data engineering and data science teams spend too much time with library installations, piecing together integrations between different systems, and configuring infrastructure. At the same time, IT operators struggle to create enterprise standards and enforce data protections in these notebook environments.
Our new smart analytics frameworks for AI Platform Notebooks powers Jupyter notebooks with our smart analytics suite of products, so data scientists and engineers can quickly tap into data without the integration burden that comes with unifying AI and data engineering systems. IT operators can also rest assured that notebook security is enforced through a single hub, whether the data workflow is pulling data from BigQuery, transforming data with Dataproc, or running an interactive Apache Beam pipeline. End-to-end support in AI Platform Notebooks allows the modern notebook interface to act as the trusted gateway to data in your organization.
How to use the new frameworks
To get started with a smart analytics framework, go to the AI Platform Notebooks page in the Google Cloud Console. Select New Instance, then from the Data Analytics menu choose either Apache Beam or Dataproc Hub. The Apache Beam option will launch a VM that is pre-configured with an interactive environment for prototyping Apache Beam pipelines on Beam’s direct runner. The Dataproc Hub option will launch a VM running a customized JupyterHub instance that will spawn production-grade, isolated, autoscaling Apache Spark environments that can be pre-defined by administrators but personalized by each data user. All AI Platform Notebooks frameworks come pre-packaged with BigQuery libraries, making it easy to use BigQuery as your notebook’s data source.
Apache Beam is an open source framework that unifies batch and streaming pipelines so that developers don’t need to manage two separate systems for their various data processing needs. The Apache Beam framework in AI Platform Notebooks allows you to interactively develop your pipelines in Apache Beam, using a workflow that simplifies the path from prototyping to production. Developers can inspect their data transformations and perform analytics on intermediate data, then launch onto Dataflow, a fully managed data processing service that distributes your workload across a fleet of virtual machines with little to no overhead. With the Apache Beam interactive framework, it is easier than ever for Python developers to get started with streaming analytics, and setting up your environment is a matter of just a few clicks. We’re excited to see what this innovative community will build once they start adopting Apache Beam in notebooks and launching Dataflow pipelines in production.
In the past, companies have hit roadblocks along the cloud journey because it has been difficult to transition from the monolithic architecture patterns that are ingrained into Hadoop/Spark. Dataproc Hub makes it simple to modernize the inefficient multi-tenant clusters that were running on prem. With this new approach to Spark notebooks, you can provide users with an environment that data scientists can fully control and personalize in accordance with the security standards and data access policies of their company.
The smart analytics frameworks for AI Platform Notebooks are available now in public beta. There is no charge for using any of the notebooks; you pay only for the cloud resources you use within the instance: BigQuery, Cloud Storage, Dataproc, or Compute Engine. Learn more and get started today.
“The SharePoint Framework (SPFx) is a page and web part model that provides full support for client-side SharePoint development, easy integration with SharePoint data, and support for open source tooling.” – as Microsoft writes on its official documentation site. https://docs.microsoft.com/en-us/sharepoint/dev/spfx/sharepoint-framework-overview
The dark side of this story is that SPFx is badly under-documented. This means you can’t find really good examples, unlike the case studies available for Azure. So when you need to build even a simple solution that creates at least two lists with a Lookup field between them, you are in trouble, because Microsoft hasn’t spent enough time providing great documentation.
Luckily, I had to solve this, and my ambition pushed me toward the solution. After several days (six working days, in this case) spent in front of my laptop, collecting the crumbs of information scattered across the Internet, I made it. Now I would like to share it with you, so you can avoid the pain if you ever have to build the same thing.
Our target for today is to create an SPFx-based app for SharePoint Online which creates two lists with a Lookup field.
Step 1: Prepare the solution in Visual Studio code
For this, Microsoft provides quite good overview documentation. Nevertheless, I’ve added some extra steps here for a better result.
Configure msvs_version. This depends on your computer, so choose whichever of the following version-related configurations works for you: npm config set msvs_version 2017 or npm config set msvs_version 2019
Now you can start to create your solution.
Step 2: Create basic project/solution in Visual Studio code
Open the root directory of your development location on your computer.
Create a directory for your solution. Today this would be multiple-lists-spfx
Step into the newly created directory
Open Visual Studio Code in this directory. From PowerShell: code . From Windows Explorer: Open with Code
Open a Terminal, then enter the following command: yo @microsoft/sharepoint
Preconfigure the project:
Accept the default multiple-lists-spfx as your solution name, and then select Enter.
Select SharePoint Online only (latest), and then select Enter.
Select Use the current folder as the location for the files.
Select N to require the extension to be installed on each site explicitly when it’s being used.
Select N on the question if solution contains unique permissions.
Select WebPart as the client-side component type to be created.
Web part name: MultipleLists
Web part description: MultipleLists description
Accept the default No JavaScript framework option for the framework, and then select Enter to continue.
At this point, Yeoman installs the required dependencies and scaffolds solution files. It takes several minutes.
Post-configuration / additional checks
1. Open gulpfile.js and replace its content with the following, so that gulp is used as a global variable
2. Open package.json and check gulp version. Do not use the latest gulp version, because that doesn’t work well with SPFx. Use this exact version
"gulp":ย "~3.9.1"
3. Create the following folder structure to root folder
sharepoint
- assets
- solution
4. Check whether deasync was blocked by your anti-virus software. If it was, please execute these commands:
npm install deasync@0.1.19
cd node_modules\deasync
node .\build.js
cd ../..
5. Trust the self-signed developer certificate. (This also tests your whole project/solution configuration, such as the gulp version and deasync.)
gulp trust-dev-cert
6. Install Insert GUID extension for Visual Studio Code. With this you can easily generate GUIDs for:
Guide instructions for the future:
For content types. Item related content type id: 0x0100 + <Insert GUID – 5>
For Field id: 0x0100 + <Insert GUID – 2>
For id’s in package-solution.json: 0x0100 + <Insert GUID – 1>
Done.
Step 3: Create your List creation code
This is the most important part. The others above are “only” the preparation. You can apply this part in case of any type of client-side component such as WebPart and Extension.
1. After you have created your basic project/solution, please create the following files in the sharepoint/assets directory:
elements.xml: this contains the site columns, content types and list instance definitions
primarySchema.xml: this contains the primary list related definitions, such as views and forms
secondarySchema.xml: this contains the secondary list related definitions, such as views and forms
2. Now start to edit elements.xml
<?xml version="1.0" encoding="utf-8"?>
<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
<!-- IsActive global field -->
<Field ID="{9b9df7c1-8dca-4954-90c1-8dcf131e30af}" Name="IsItActive" DisplayName="Active" Type="Boolean" Required="FALSE" Group="CloudSteak Columns" />
<!-- Content type for Secondary list -->
<ContentType ID="0x010036ee6af136ed47a48c82fb0916a627ba" Name="SecondaryCT" Group="CloudSteak Content Types" Description="Sample content types from web part solution">
<FieldRefs>
<FieldRef ID="{9b9df7c1-8dca-4954-90c1-8dcf131e30af}" />
</FieldRefs>
</ContentType>
<!-- Secondary list -->
<ListInstance CustomSchema="secondarySchema.xml" FeatureId="00bfea71-de22-43b2-a848-c05709900100" Title="Secondary" Description="Secondary List" TemplateType="100" Url="Lists/Secondary">
</ListInstance>
<!-- Lookup field for Secondary list -->
<Field ID="{B2C98746-DE9D-4878-90C1-D3749881790F}" Name="SecondaryLookup" DisplayName="Secondary" Type="Lookup" ShowField="Title" List="Lists/Secondary" Required="TRUE" Group="CloudSteak Columns" />
<!-- Content type for Primary lists -->
<ContentType ID="0x010042D013E716C0B03B457EB2E6699537B99CFE" Name="PrimaryCT" Group="CloudSteak Content Types" Description="Sample content types from web part solution">
<FieldRefs>
<FieldRef ID="{B2C98746-DE9D-4878-90C1-D3749881790F}" />
</FieldRefs>
</ContentType>
<!-- Primary list -->
<ListInstance CustomSchema="primarySchema.xml" FeatureId="00bfea71-de22-43b2-a848-c05709900100" Title="Primary" Description="Primary List" TemplateType="100" Url="Lists/Primary">
</ListInstance>
</Elements>
Last week I showed how easily you can create an Azure DevOps pipeline based on YAML. Today I’m going to show you how to run it on a schedule. (You can find the video version at the end of this post.)
I know that you often need to execute a script every hour or every day. Maybe until now you had a dedicated machine where you did this. From now on you can eliminate that and use an Azure DevOps pipeline with a time trigger instead.
In my post from last week (HandsOn – Azure Pipelines with YAML) you can read how to prepare a project on GitHub and Azure DevOps. I made one tiny change for today: I put the code into an Azure DevOps repository, so you can still use last week’s guide for preparation. Note: if you use an Azure DevOps repository, the steps diverge at point 3 under Step 5, where you should choose the Azure Repos Git option.
The further steps are almost the same. When you open the pipeline, you can see the code we made last time.
Now let’s schedule our pipeline from YAML. Go to Visual Studio Code, where I opened the code after the git clone, open the YAML file, and insert the schedule (time trigger) section at the top of the existing file; a minimal sketch follows after this list.
always: true – set this if you want the scheduled run to happen even when there are no code changes since the last successful scheduled run; without it, the schedule only fires when the source has changed.
cron – this is standard crontab-style scheduling. If you are not familiar with the syntax, open https://crontab.guru in your browser; it helps you find the right expression. Be careful with timing: schedules in Azure DevOps run in UTC, so “10 AM every day” means 10 AM UTC every day.
Last but not least, give the schedule a user-friendly displayName. This is useful because you can define multiple schedules for one pipeline.
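For reference, here is a minimal sketch of what the schedules section could look like at the top of the YAML file. The cron expression and the branch name are only example values, so adjust them to your own pipeline:
schedules:
- cron: "0 10 * * *"        # crontab syntax: every day at 10:00 (UTC)
  displayName: Daily 10 AM UTC run
  branches:
    include:
    - master                # example branch name; use your own default branch
  always: true              # run even if there are no code changes since the last scheduled run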
Save the file, then push it back to your repository. Note that the pipeline will run immediately when you push the changes to the repo; this is the normal behavior here.
Now let’s check the pipeline scheduling. Inside Azure DevOps, navigate to your Organization > your Project > Pipelines > YAML Pipeline. Click the three vertical dots to see the additional options and choose Scheduled runs.
Fantastic… you can see the configured schedules:
A few minutes later you can see that your pipeline runs according to the configured schedule.
I hope you find this useful. Next time we will continue playing with Azure DevOps pipelines.
Last week I published an article about the new capabilities of Azure DevOps pipelines (with YAML). As promised, we continue this topic today. Although the documentation I linked last week contains the most important information, I feel I should show you how it works in real life. Why? Because the documentation is not the best for that… as usual.
Scenario for today: create a basic NodeJS project stored in a GitHub repository. The project is a simple script that will be executed by an Azure DevOps pipeline via a YAML file.
After this “episode” you will be able to automatically execute any of your existing NodeJS scripts or solutions in an Azure DevOps pipeline. This means you get a CI/CD-like setup for your existing NodeJS solution. It’s cool, isn’t it?
Additionally, I’ve decided to create a video guide for this article, which helps you to see the whole flow, step-by-step. You can find it below.
Step 1: Create a GitHub repository for your code
As usual, the first step is to create a private or public repository on GitHub. This is a basic step and I assume you can do it. 🙂
Step 2: Write your basic code with YAML file
The next step after creating the GitHub repository is to clone it to your local computer. For this I use Visual Studio Code.
Open your cloned repository and start creating the NodeJS solution.
Create the following 3 files:
package.json: contains the project-related information and the required package list for your solution.
startPilepine.js: this is our very basic existing NodeJS solution. Later on you can replace it with your own existing solution.
pipeline.yaml: Azure DevOps will build the pipeline from this file, so it contains the pipeline-related information such as environment settings, pipeline steps, and triggers; see the sketch after this list.
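To make the three files more concrete, here is a minimal sketch of what pipeline.yaml could contain. The Node version, agent image, and the way the script is invoked are assumptions for this walkthrough, not the exact file from my repository, so adjust them to your own project:
trigger:
- master                          # run the pipeline on every push to master

pool:
  vmImage: 'ubuntu-latest'        # Microsoft-hosted build agent

steps:
- task: NodeTool@0                # install Node.js on the agent
  inputs:
    versionSpec: '13.x'
  displayName: 'Install Node.js'

- script: |
    npm install
    node startPilepine.js
  displayName: 'Install dependencies and run the script'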
Step 3: Push your code to GitHub
When our solution is ready to use, we push it to the newly created GitHub repository. There are no special steps here; simply commit and push the changes.
Step 4: Create Azure DevOps project
Our solution is now on GitHub, so we are almost done. Next we create a project in Azure DevOps where we will create our pipeline. Follow these steps:
Inside the selected organization, click the New Project button in the top right corner of the screen.
In the Create new project window, enter the Project name and Description.
Select the required visibility of the project: Public or Private. Be careful, this is not the same as the visibility of your GitHub repository.
In the Advanced section, Version control should be Git.
If everything is OK, click the Create button.
Step 5: Create pipeline
Inside the newly created Azure DevOps project, go to the Pipelines section (left-side menu).
Then click the Create Pipeline button in the middle of the screen.
On the next screen, select GitHub (YAML) from the list.
The next step is to select a repository from GitHub. If Azure DevOps is not connected to GitHub yet, you have to connect it here. Once connected, you can see the list of your repositories and choose the required one.
When you select the repository, the Approve & Install Azure Pipelines window appears, where you have to grant Azure Pipelines access to your repository.
After this, you can start configuring the pipeline. Choose the Existing Azure Pipelines YAML file option on the Configure your pipeline screen.
Then a sidebar appears on the right side of the screen (Select an existing YAML file), where you must select the YAML file (from the required branch) that contains the pipeline definition. Select pipeline.yaml and click the Continue button.
The pipeline is created after a few seconds. Here you can Run it immediately or just Save it.
From this point you have a pipeline built from a YAML file.
Step 6: Execute the pipeline
Great, we have a pipeline. To execute it we now have two options:
Run manually from Azure DevOps
Navigate to Azure DevOps > Your organization > Your project > Pipelines > Pipelines > All
Click on name of your YAML based pipeline
Finally click on Run pipeline
Triggered by GitHub push
Modify your code (in Visual Studio Code)
Then push the modification to the GitHub repository
This will trigger the pipeline to execute.
Step 7: Check result
This is a very simple thing.
Navigate to Azure DevOps > Your organization > Your project > Pipelines > Pipelines > All
Click on name of your YAML based pipeline
Select the required Run from the list
On the new screen, click the node_13_x job
Here you can see the details of the execution
We are done with the scope of the scenario. 🙂
Last but not least: you can find the video below, if you prefer video content:
I have mentioned several times that my favorite area is automation. I like building automation because it is the best way to make other people’s lives easier. 🙂 Maybe you remember that last year I published a series titled E2E Python solution in DevOps (Azure DevOps).
Since then the capabilities of Azure DevOps have become better than ever. I like to use it for Scrum management, code repositories, CI/CD activities, and so on. And two days ago Microsoft announced another cool thing: the general availability of YAML CD features in Azure Pipelines. Why is this so important? Because from now on you can use your YAML knowledge to create great CI/CD pipelines in Azure DevOps. 🙂
As Microsoft highlighted:
YAML CD features introduces several new features that are available for all organizations using multi-stage YAML pipelines. Some of the highlights include.
Therefore I’ve decided to start creating some solutions with YAML in Azure DevOps, and I’ll provide them as a guide you can use in your daily job.
If you are not familiar with this topic and feel you need some “training”, you can find some useful information here:
So in the next automation-related post I’ll bring you a YAML-based pipeline solution so you can discover the capabilities of YAML-based pipelines.
If you have any ideas or recommendations about scenarios we should implement, feel free to contact me. And get in touch if you are struggling with any automation-related topic.
Everybody is at home, trying to find an interesting topic and goal to spend the time usefully. Or maybe you feel exhausted? Either way, you need to find an exciting activity or area where you can relax in the world of technology. Today I would like to suggest a great topic which is both exciting and useful. Two in one! It’s cool, isn’t it?
After the first quarter of 2020 you could read in financial reports that on-premises-related revenues were higher than cloud-related ones. Why? Maybe because in the current situation most companies have decreased their cloud-related activities. Nevertheless, this is not the best strategy in the long term. I don’t want to do a financial analysis now, merely draw your attention to the fact that this is a great chance to plan and test how you can migrate your on-premises datacenter to Azure. And here there is another big “why?”.
Now the time pressure is lower and the workload is far from the usual. Additionally, you may feel you don’t have much motivation for your everyday activities. No problem, let’s start something exciting, something new, or something which has been waiting for you… 🙂 Planning and playing with migration is one of the best activities for this purpose.
Azure has a dedicated service that supports the migration of external resources (such as on-premises ones) to Azure: Azure Migrate. With it you have the chance to migrate your on-premises services to Azure in a partially managed way.
You can migrate tons of resources like:
Windows and Linux servers from Hyper-V
Windows and Linux servers from VMware
Windows and Linux Physical servers
Databases
MS SQL
MySQL
MariaDB
Non-SQL
Web Applications
VDIs
To start, I suggest reading the documentation above. Nevertheless, if you would like to jump right into the middle of this topic, let’s start in the portal. In the Azure Portal, when you open the Azure Migrate service page, you will see that it is a well-structured and understandable process. Please don’t forget that migration is never an easy (next-next-finish) process.
I’m not going to create a step-by-step action plan for each and every resource type migration, because that is impossible. Sorry for the bad news. Nevertheless, I would like to share some useful migration links. These links can help you discover the resources that can be migrated, so you can then plan the migration (its timing and steps).
Maybe you know that the “lift and shift” migration strategy is not the best solution in the long term. Additionally, there will be several services that cannot be migrated without redevelopment, rebuilding, or restructuring. Here you should think about microservices, then choose the right architecture pattern and “choose” one of the XaaS options (IaaS, PaaS, SaaS, etc.). As you can see, this is a very complex topic. However, it is a great chance to improve your knowledge and spend your time usefully.
I hope you spent the Easter holiday well and had enough time to relax.
Luckily I feel good and last week was really great. Therefore I would like to give you a post-Easter news bucket this week. 🙂
1. Action required: Azure Database
Maybe you also received an email with this subject. If not, you should know that Microsoft is going to change the public endpoint IP addresses of Azure Database for MySQL, PostgreSQL Single Server, and MariaDB in several regions.
Change time window: 20 May 2020 00:00 UTC – 30 June 2020 00:00 UTC.
Recommended action:
Ensure the firewall allows outbound traffic on port 3306 for MySQL/MariaDB and port 5432 for PostgreSQL to the new IP address(es).
Connect to your Azure Database for MySQL, PostgreSQL Single server, and MariaDB using the DNS record instead of the IP address.
2. Temporarily halted any deprecation enforcement of TLS 1.0 and 1.1
Since 2018 we have known that TLS 1.0 and 1.1 would be deprecated in 2020. Nevertheless, for the well-known reason, Microsoft has postponed this until the middle of summer.
3. Azure Monitor for virtual machines is now generally available
Yesterday, Microsoft updated their earlier announcement about Azure Monitor for VMs. To be honest, I was very happy about the original announcement because I already liked this feature in preview.
Finally, some small but good news. Microsoft continues to improve Logic Apps for better usability and better automation. They started that last November: Workflow automation with Logic Apps. Now they have shared their current progress. You can find the new announcement here: Workflow automation with Logic Apps
When you start your business and have only a few customers, you are in an easy situation from a performance perspective: you only need to provide modest resources to give your customers the right, acceptable performance.
Then your business grows, you must serve your customers’ needs, and you have to increase system performance. Sometimes this is easy, sometimes not.
For this there is a quite common solution which helps you provide a fast and reliable service: a Redis cache, which is available for almost every solution in the cloud.
Redis is a great solution. You can find more information about the biggest hyperscalers’ offerings:
All of them are great, reliable, easy to use, and fast. Nevertheless, today I would like to draw your attention to a brand new feature which allows you to replicate your Redis cache data across different regions. It’s cool, isn’t it?
From now on, AWS offers cross-region replication for this cache service (ElastiCache Global Datastore).
How does it help?
This solution can be useful if geo-local performance is important for you, or from a disaster recovery point of view.
How does it work?
You take an existing Redis cluster (or build a new one) in a region where Global Datastore is supported. This will be the primary (active) cluster.
After some configuration steps, AWS provisions the secondary (passive) Redis clusters in the selected regions. (You can create a maximum of 2 passive clusters.)
Limitations
As you can see above, there are several limitations to this feature.
You can only add read replicas (READ caches) to your primary cluster
Maximum number of replicas: 2
Global Datastore is available only in some regions
ElastiCache doesn’t support autofailover from one AWS Region to another.
If you feel this could be a great opportunity to increase your business’s performance in the near future, my suggestion is to start exploring this new feature.
Maybe you remember that I published an article about the future of Microsoft certification: 2020 is the year of changes: Microsoft Azure Certification. Accordingly, everyone started to reorganize their plans regarding trainings and exams for this year.
Then last week Microsoft announced several significant changes which affect everything we have planned since the beginning of February.
I can see tons of question marks in your mind. Is it good news? Is it bad news? Let me quote something:
Ah, Shifu, there is just news. There is no good or bad.
Oogway (Kung Fu Panda)
So, don’t worry… Microsoft announced the following things:
Testing centers closing and online proctoring capacity increasing
Reschedule and cancellation fees waived
Retirement of MCSA, MCSD, MCSE certifications and related exams extended
Expiring role-based certifications extended
Exam voucher and discount offer expiration dates extended
Moving to virtual training
In a nutshell: Microsoft is increasing its online exam capacity, which is really good news. Additionally, if you have any expiring certification, voucher, or discount offer, it will be extended until January 31, 2021. Finally, the retirement date of all MCSE, MCSA, and MCSD certifications is postponed to January 31, 2021.
Luckily, last week I published a post about the possibilities of online Azure exams. According to our current information, Microsoft has enough capacity to handle our online exam ambitions.
So let’s go back and start planning our career paths again. 🙂
Recently, online activity has been higher than before. Most of you are at home, trying to figure out how you can take your planned Microsoft exams if the local Microsoft Learning Partners aren’t open. Yes, it’s hard. Nevertheless, you have another option for taking the planned exams: you can do it online. How? Microsoft offers an online exam option for you.
You can take any role-based or fundamentals exam online in the comfort of your home or office while being monitored by a proctor via webcam and microphone.
Microsoft Learning
You can find the most important information about this opportunity here: About online exams
Now, let’s go through the basics:
Step 1 – preparation
Before you take an exam, be sure you are well prepared. In this phase you need to spend several days learning; this is the foundation of everything. I am sure you have tons of material for good learning and practicing. If not, I have some suggestions for you:
When you feel you are ready for the exam, go to the second step.
Step 2 – technical check
Once you have enough knowledge to pass the exam, you must check your home equipment (internet, microphone, webcam, …) for online exam readiness. Follow these steps:
I am sure you already know about the automation actions in GitHub. The official definition by GitHub:
Automate, customize, and execute your software development workflows right in your repository with GitHub Actions.
help.github.com
GitHub Actions are small tasks which help you to create customized workflows. With them you can step up to the highest level of development automation.
You can find several useful actions on the GitHub Marketplace which support your efforts toward better automation.
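If you have never seen one, a GitHub Actions workflow is just a YAML file in your repository under .github/workflows. Here is a minimal, hypothetical example for a NodeJS project; the workflow name, trigger, and steps are illustrative assumptions only, not a recommendation from GitHub or Docker:
name: CI                           # hypothetical example workflow

on: [push]                         # run on every push to the repository

jobs:
  build:
    runs-on: ubuntu-latest         # GitHub-hosted runner
    steps:
    - uses: actions/checkout@v2    # check out the repository content
    - uses: actions/setup-node@v1  # install Node.js on the runner
      with:
        node-version: '12.x'
    - run: npm install             # install dependencies
    - run: npm test                # run the project's tests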
Additionally 2 days ago Docker announced that they had released their first GitHub Action. You can find some details here.
As you can see, their example is simple and their approach easy to follow. Obviously, this alone is not enough to achieve your goals quickly, so I would like to provide some useful links where you will find the related information for a good start:
Nowadays, environmental awareness is critical. I hope you are doing more and more for our shared environment and to decrease your carbon footprint. Additionally, I am sure you are making several efforts toward sustainability. It’s also very important to me, because I would like to help save our planet. 🙂
This mindset shows up in my job. How? I spend more time designing the optimal architecture, and I try to avoid building over-sized systems in the cloud.
Nevertheless, what about the enterprises? What do you think? Do they make efforts toward sustainability?
I know the answer is yes. Nevertheless, calculating (and measuring) the carbon footprint in the cloud is very difficult because we don’t have access to the hardware. Obviously there are methods for estimating the carbon emissions related to your cloud resources; unfortunately, these can be far from reality.
But! From now on you can use a new tool from Microsoft for Azure: the Microsoft Sustainability Calculator.
As you can read in the Microsoft announcement from January 2020: “Microsoft has been investing to reduce environmental impact while supporting the digital transformation of organizations around the world through cloud services.” This is great news, though it is just the beginning. I am sure it is not perfect yet. Nevertheless, this calculation should be closer to the real values, so you can start moving in the right direction and your company can become more environmentally conscious.
Before I finish this article, there is one important piece of information you must know about this tool: “The Microsoft Sustainability Calculator runs on Power BI Pro.”
My suggestion is to start testing this tool on your development subscription, then show the results to your managers. 🙂
Last November Microsoft announced the Visual Studio Online public preview. “Visual Studio Online provides cloud-powered development environments for any activity,” as Microsoft defines it. This is a really great idea, and it requires an active Azure subscription. Additionally, you can manage your VSO environments from Visual Studio Code.
That was last year. Now Microsoft has provided some new updates, such as Dockerfile support. Also, real-time collaborative development is far more stable than before.
I really like this feature because it can be useful for collaborative development and/or debugging: you can share your thoughts with your colleagues while all of you see the code live.
When you work with Python code day by day, you want to reach the point where your code is well managed and you can collaborate with contributors in the most efficient way.
You use git for code management, and in most cases it is efficient from a collaboration point of view. Nevertheless, there are situations or projects where you need something more, or a little bit different.
For this there is a great “tool”: Jupyter. Since 2014, the project has done its best to provide cool features for you. 🙂
Today I don’t want to give an introduction to Jupyter Notebook and JupyterHub; I merely want to share a great article about how you can configure Visual Studio Code for Jupyter.