Secure Coding Best Practices & WAAP for Application Hardening (Sanjay – Executive Director, MSCI)

Overview

In this podcast, Sanjay (Executive Director, MSCI) talks to Venky about secure coding best practices & methods to handle customer-sensitive data. 

He also shares why secure software doesn’t happen by accident: it requires deliberate, organization-wide effort across people, processes, and technology.

Indusface

Indusface is a leading application security SaaS company that secures the critical web, mobile, and API applications of 5000+ global customers using its award-winning, fully managed platform, which integrates a web application scanner, web application firewall, DDoS and bot mitigation, CDN, and a threat intelligence engine.

Key highlights from the discussion:
  • About Sanjay's experience & projects (Microsoft, Corel & Salesforce)
  • Data encryption at rest and in transit
  • The concept of Dynamic Data Masking
  • Security problems, despite good coding practices
  • Security practices across each stage of the organizational processes
  • Methods/Steps while accessing customer-sensitive data
  • WAF & WAAP for Application Hardening
  • Utilizing generative AI for building improved security models
  • Three guiding principles to follow in cybersecurity

Transcript

I co-founded a security startup company in Bangalore. And then shortly after, I moved to Canada to work at the company Corel, where I eventually led the Corel WordPerfect and the Corel Ventura teams for multiple releases. 

And in 2003, I applied for a job at Microsoft. At Microsoft, I led the team responsible for SQL Server Management Studio, SQL Server Data Tools for Visual Studio, and SQL PowerShell. 

I also worked on ADO.NET and related data-access frameworks, all the connectivity drivers (.NET, Java, Python) for SQL Server and Azure SQL Database, and the entire DevOps toolchain, including tooling that is now in Visual Studio Code and what is now called Azure Data Studio. 

So, I was at Microsoft during a truly transformational period. 

Azure SQL Database, for example, taught me a ton about creating and operating PaaS services at massive scale. We shipped SQL Server on Linux and in containers such as Docker and Kubernetes for the first time, and helped win back the lost developer mindshare with tools that ran on Linux, macOS, and Windows, like Visual Studio Code and Azure Data Studio, all for the first time in the product’s history; we had never done that before. 

I had a lot of fun building platform (PaaS) services and wanted to learn more about SaaS. So, in 2017, I switched domains. I went to Salesforce to learn about the CRM space. At Salesforce, I led a team that built a secure, compliant, massively scalable, distributed data platform for one of their products, Marketing Cloud. 

And my team was responsible for managing the lifecycle of all customer data from birth to archival. So I owned critical platform capabilities for data security, GDPR, data compliance, performance, and customer-facing features like audit trails and data encryption using customer-provided keys. 

A couple of years ago, in 2021, I switched domains again; this time I wanted to learn about fintech. So, I joined a company called MSCI to lead their data and AI team. Here, my team again builds data platforms and AI models to help our clients unlock the next wave of innovation in global investment decision-making across index, analytics, environmental, social, and governance (ESG), and climate. 

At Microsoft, I learned a great deal about security. 

One of my top five projects is called “Always Encrypted,” which we built into Azure SQL Database. Back in 2016, our customers didn’t trust the cloud with their data. 

Customers wanted to ensure that their sensitive data, like credit card numbers or Social Security numbers, was always encrypted at rest and in transit. That was the reason why we built that Always-Encrypted feature to meet this unmet demand. 

Essentially, there were three big goals: 

  • We wanted to ensure that data stays encrypted at rest and in transit. 
  • Protect data from man-in-the-middle attacks. 
  • Prevent data access to unauthorized users, even highly privileged users. 

What Always Encrypted does is allow clients to encrypt sensitive data inside their applications and never reveal the encryption keys to the database engine. Everything happens client-side. It provides a clear separation between those who own the data and can view it, and those who manage it but should have no access.  

For example, database administrators and cloud operators manage the data but should not be able to see it. Essentially, Always Encrypted ensures that your data is protected even if somebody steals your backup. 

We built new capabilities in the driver to do three things. 

  1. Automatically encrypt data in sensitive columns on the client side before sending it to the database engine. 
  2. Automatically decrypt the data in sensitive columns when it is returned by a query. So when you do a SELECT, you get the columns back and they are decrypted on the fly.
  3. Automatically rewrite queries on the client side so that the semantics of the query were preserved. 

Once configured, Always Encrypted is completely transparent to application writers. And if there is ever a man-in-the-middle attack, the data on the wire stays encrypted. The results were significant because this changed customer perception about putting data in the cloud. 
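To make the client-side experience concrete, here is a minimal sketch, assuming pyodbc with the Microsoft ODBC Driver 17 or later and a column master key the client can reach; the server, database, table, and column names are all hypothetical. The application’s only change is opting into column encryption in the connection string; the driver then encrypts parameters, decrypts result columns, and rewrites queries transparently.

```python
# Minimal sketch: client-side Always Encrypted via pyodbc (hypothetical server,
# database, table, and column names). Assumes the Microsoft ODBC Driver 17+ and
# a column master key the client can reach (e.g., Azure Key Vault or a local cert store).
import os
import pyodbc

connection_string = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:example.database.windows.net;"
    "Database=SalesDb;"
    f"Uid=app_user;Pwd={os.environ['DB_PASSWORD']};"
    "Encrypt=yes;"
    "ColumnEncryption=Enabled;"  # opt in: the driver handles encrypt/decrypt transparently
)
conn = pyodbc.connect(connection_string)
cursor = conn.cursor()

# The SSN parameter is encrypted on the client before it ever reaches the database engine.
cursor.execute(
    "INSERT INTO dbo.Customers (Name, SSN) VALUES (?, ?)",
    ("Alice", "123-45-6789"),
)

# On SELECT, the driver decrypts the SSN column on the fly; the server only ever sees ciphertext.
cursor.execute("SELECT Name, SSN FROM dbo.Customers WHERE Name = ?", ("Alice",))
print(cursor.fetchone())
conn.commit()
```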

There’s one more project called Dynamic Data Masking. 

The premise is that data is stored in the database, and then various forms and user interfaces are built to view that data. 

So essentially, if you walk up to the user interface and log in as Sanjay, you want the Social Security number stored in the form to be replaced with ****, whereas if you log in as Venky, you want to see the real value. 

To enable this, we built a capability called dynamic data masking in the Azure SQL database. You could walk up to the database, pick a column, and mask the data in the column. 

We had standard patterns for social security numbers, credit cards and a bunch of national IDs. But you could specify your own pattern, which the server would then enforce. 

And by the time it reaches the client, the data is already masked, and it’s again based on RBAC. So, depending on who logs in and whether they have permission to see it, you either see the data or you don’t. 

So typical scenario for this was, let’s say, Venky entering some data. Sanjay peeks over Venky’s shoulder to see credit card numbers. But guess what? That’s not possible anymore. 
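As an illustration, here is roughly what that configuration looks like, sketched as T-SQL statements issued from Python; the table, column, and user names are hypothetical. The mask is defined once on the column, and the UNMASK permission decides who sees real values.

```python
# Sketch: configuring Dynamic Data Masking and controlling who sees unmasked values.
# Hypothetical table/column/user names; assumes an existing pyodbc connection `conn`
# whose login can ALTER the table and GRANT permissions.

MASKING_STATEMENTS = [
    # Partial mask: pad with "XXX-XX-" and expose only the last 4 digits of the SSN.
    """ALTER TABLE dbo.Customers
       ALTER COLUMN SSN ADD MASKED WITH (FUNCTION = 'partial(0, "XXX-XX-", 4)')""",
    # Built-in default mask for the credit card column.
    """ALTER TABLE dbo.Customers
       ALTER COLUMN CreditCard ADD MASKED WITH (FUNCTION = 'default()')""",
    # Venky may see real values; Sanjay has no UNMASK permission, so he sees masked data.
    "GRANT UNMASK TO Venky",
]

def apply_masking(conn) -> None:
    cursor = conn.cursor()
    for statement in MASKING_STATEMENTS:
        cursor.execute(statement)
    conn.commit()

# The same SELECT now returns 'XXX-XX-6789' style values for Sanjay and the real
# data for Venky; the application code does not change at all.
```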

One of the key principles that I’d like to consider here is that building secure software does not happen by accident. It requires a mindset change across people, processes, and technology. And the desired outcome is to make security a shared responsibility for everyone building the software. 

After all, we are doing this to sell the software to a customer, and you want to win the hearts, minds, and trust of customers. 

Here’s a subset of what I’ve used in the past. 

Validate all inputs – The golden rule is to assume all inputs are untrusted: trust no one. That forces you into the mindset of doing proper input validation, which helps eliminate most software vulnerabilities (not all, but most). 
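Here is a small sketch of that mindset, with illustrative field names and patterns of my own choosing: treat every field as untrusted, validate against an allow-list, and reject anything that does not match rather than trying to clean it up afterwards.

```python
# Sketch: allow-list input validation; field names and patterns are illustrative.
import re

USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,32}")   # letters, digits, underscore only
ACCOUNT_ID_RE = re.compile(r"\d{6,12}")           # purely numeric, bounded length

class ValidationError(ValueError):
    pass

def validate_request(payload: dict) -> dict:
    """Return a cleaned copy of the payload or raise; never trust the caller."""
    username = str(payload.get("username", ""))
    account_id = str(payload.get("account_id", ""))

    if not USERNAME_RE.fullmatch(username):
        raise ValidationError("username fails allow-list check")
    if not ACCOUNT_ID_RE.fullmatch(account_id):
        raise ValidationError("account_id fails allow-list check")

    # Downstream, parameterized queries (never string concatenation) keep even
    # validated values from being interpreted as SQL.
    return {"username": username, "account_id": account_id}
```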

Principle of least privilege – Every process should execute with the minimum privileges necessary to complete the job. Any elevated permission is granted only by exception. And if you do grant it by exception, then you have to audit the actions performed, like who was granted that exception, and it should be time-bound. Don’t grant anything forever; think just-in-time access. 

Deny access to data and processes by default – I’ve used techniques such as role-based access control to grant or deny access: who can do what, when, and for how long. 

Defense in depth – The idea here is to manage security risk with multiple defensive strategies, so you essentially limit the blast radius. 
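To show how least privilege, deny-by-default, and just-in-time access fit together, here is a minimal sketch; the roles, actions, and expiry policy are illustrative assumptions, not a description of any particular system.

```python
# Sketch: deny-by-default RBAC with time-bound exception grants and auditing.
# Roles, actions, and durations are illustrative.
from datetime import datetime, timedelta, timezone

ROLE_PERMISSIONS = {
    "app_operator": {"view_metrics", "restart_service"},   # least privilege: no data access
    "data_engineer": {"run_etl"},
}

class AccessControl:
    def __init__(self):
        self.exceptions = []   # (principal, action, expires_at, reason)
        self.audit_log = []

    def grant_exception(self, principal, action, minutes, reason):
        """Elevated permission only by exception: always time-bound, always audited."""
        expires_at = datetime.now(timezone.utc) + timedelta(minutes=minutes)
        self.exceptions.append((principal, action, expires_at, reason))
        self.audit_log.append(("GRANT", principal, action, reason, expires_at))

    def is_allowed(self, principal, role, action):
        """Anything not explicitly granted is denied."""
        now = datetime.now(timezone.utc)
        allowed = action in ROLE_PERMISSIONS.get(role, set()) or any(
            p == principal and a == action and exp > now
            for p, a, exp, _ in self.exceptions
        )
        self.audit_log.append(("CHECK", principal, action, allowed, now))
        return allowed

# Usage: an operator cannot read customer data unless a short-lived exception exists.
ac = AccessControl()
assert not ac.is_allowed("sanjay", "app_operator", "read_customer_data")
ac.grant_exception("sanjay", "read_customer_data", minutes=60, reason="SEV-2 debugging")
assert ac.is_allowed("sanjay", "app_operator", "read_customer_data")
```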

Let me tell you about some of the trickier ones from my experience; these relate to the shift-left idea. 

  1. Be intentional about designing for security upfront – What would it look like if you identified and defined security requirements early in the development cycle? You can’t evaluate the security posture of a system when the security requirements were never defined and are only discovered later; by then it’s too late.
  2. The notion of DevSecOps – Shift left is about finding vulnerabilities earlier, and DevSecOps is about finding vulnerabilities continuously, so there is a way to marry the two. A few other similar secure software development lifecycle processes can be followed.
  3. The notion of integrating security testing at every stage of the software development process – for example, static and dynamic code analysis at development time, then penetration testing in UAT.
  4. Change management – what source code changes were made, code reviews, and a CI/CD pipeline that kicks in every time you make a change and ensures your vulnerability tests run.
  5. Compliance management – are you handling data correctly within your app? Are you encrypting it correctly if it is customer or PII data? Is one customer able to see another customer’s data? What if there is a bug in the system? (See the tenant-isolation sketch after this list.)
  6. Threat modeling – If I draw a block diagram of my architecture, do I know what data flows from components A and B to component C and how it is transformed along the way? What are the security boundaries? Is there an external process being called?
  7. Security training – We are not born with these skills. So, we all need training, and the thing is, we also need up-to-date training because the bad actors are also becoming smarter.
  8. Regulatory and reporting considerations – for example, when an organization produces a report that goes to a regulatory committee, and so on. 
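On the tenant-isolation point in the compliance item above, this is the kind of guardrail I mean, sketched with hypothetical table and column names: every data-plane query is scoped to the calling tenant, so one customer can never see another customer’s rows even when application code has a bug.

```python
# Sketch: tenant-scoped data access; table and column names are hypothetical.
# Every query is forced through a helper that binds the tenant filter, so
# "forgetting the WHERE clause" cannot leak another customer's data.

class TenantScopedRepository:
    def __init__(self, conn, tenant_id: str):
        self._conn = conn            # e.g. a pyodbc connection
        self._tenant_id = tenant_id  # taken from the authenticated context, never user input

    def fetch_orders(self, status: str):
        cursor = self._conn.cursor()
        cursor.execute(
            "SELECT order_id, amount, status "
            "FROM dbo.Orders WHERE tenant_id = ? AND status = ?",
            (self._tenant_id, status),
        )
        return cursor.fetchall()

# A request authenticated as tenant A can only ever construct a repository for tenant A.
```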

As I said, there are some obvious ones and some not-so-obvious ones. And the not-so-obvious ones require a lot more cultural and mindset change than technology. Everybody is at a different point in their journey to adopt DevSecOps, and it’s different for different applications. Remember to do the right thing.

I want to separate the notion of monitoring cloud services and applications for reliability, availability, performance, and security from the notion of monitoring access to customer data, including PII. 

In my experience, you don’t need access to the customer’s PII to monitor the application’s reliability, availability, performance, and security. 

But now, let’s come to the data part. Like any other cloud service, the ones I’ve helped build over the years collect operational telemetry, logs, audit trails, etc., to help us monitor and operate the service and mitigate service disruptions. 

However, all PII was removed from those data stores. We never stored PII in our logs, audit trails, or operational telemetry; you don’t need it to operate, monitor, or alert on the service.  
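As a sketch of how that separation can be enforced mechanically (the patterns and example message are illustrative): scrub PII before anything is written to logs or telemetry, so the data stays useful for monitoring and alerting without ever containing customer data.

```python
# Sketch: redact PII before it reaches logs or telemetry; patterns are illustrative.
import logging
import re

REDACTIONS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN-REDACTED]"),       # US SSN shape
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[CARD-REDACTED]"),      # card-like digit runs
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL-REDACTED]"),    # email addresses
]

class PiiRedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        for pattern, replacement in REDACTIONS:
            message = pattern.sub(replacement, message)
        record.msg, record.args = message, ()   # store the scrubbed message
        return True                             # keep the (now clean) record

logger = logging.getLogger("telemetry")
logger.addHandler(logging.StreamHandler())
logger.addFilter(PiiRedactingFilter())
logger.setLevel(logging.INFO)

logger.info("payment failed for alice@example.com card 4111 1111 1111 1111")
# -> "payment failed for [EMAIL-REDACTED] card [CARD-REDACTED]"
```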

Now, the customer data collected by the app or service, including PII, was stored in a multi-tenant architecture, and access to these data stores happened via a control plane and a data plane. 

There were two concepts: 

  • The control plane, to provision and orchestrate compute and storage. 
  • The data plane, to access the data itself, governed by RBAC. 

We heavily used RBAC. As the cloud service operator, we could not access the customer’s data; only the customer could. 

Yes, I’ve had some situations where we needed to access the customer’s data to debug a specific customer-reported problem. For example, this particular bug occurs only with that piece of data that belongs to the customer. 

Here are some examples of the overall approach I’ve used. 

Provide training to employees before they access any customer data, including PII, for example, how to request and handle customer data, how to store customer data, how to transmit, share, and encrypt customer data, how and when to delete the customer data when you’re done with it.   

Ensure customer transparency and communicate with them. You must ask them before touching their data: get explicit permission from your legal and compliance teams and from the customer before accessing it. Tell the customer upfront what data will be accessed, for what purpose, and for how long. And then, when you’re done, provide evidence to the customer that the data was deleted after the problem was resolved. Of course, be prepared for the customer to say, “No, you can’t access my data,” in which case you have to find another way to mitigate the problem. 

Then there is the notion of a just-in-time process to access customer data, with complete auditing of the actions performed, for regulatory and compliance purposes. The point here is simple: as cloud service operators or PaaS service creators, we are out to solve a customer problem. We must check with the customer: “Can I use your data? Only your data can help me resolve this problem, so help me help you.” The mindset is different. It’s not like you secretly go and run a SELECT * FROM. It’s all about transparency and sharing your intent clearly up front; it comes down to that one simple rule. In my experience, customers love this level of transparency because it’s new to them. 
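A minimal sketch of such a just-in-time workflow, with hypothetical names and durations; the essential properties are that access starts only after explicit customer approval, is time-boxed, and every action lands in an audit trail that can be shared back with the customer as evidence.

```python
# Sketch: just-in-time, customer-approved access to customer data with a full audit trail.
# Names, durations, and fields are illustrative.
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

@dataclass
class AccessRequest:
    engineer: str
    customer: str
    purpose: str                       # e.g. "debug customer-reported ticket"
    duration: timedelta
    approved_by: Optional[str] = None
    expires_at: Optional[datetime] = None

@dataclass
class JitAccessBroker:
    audit_trail: List[Tuple] = field(default_factory=list)

    def approve(self, request: AccessRequest, customer_contact: str) -> None:
        """Access starts only after the customer (and legal/compliance) says yes."""
        request.approved_by = customer_contact
        request.expires_at = datetime.now(timezone.utc) + request.duration
        self.audit_trail.append(
            ("APPROVED", request.engineer, request.customer, request.purpose, request.expires_at)
        )

    def access_data(self, request: AccessRequest, action: str) -> bool:
        """Every access attempt is checked against the time-boxed grant and audited."""
        now = datetime.now(timezone.utc)
        allowed = request.expires_at is not None and request.expires_at > now
        self.audit_trail.append(("ACCESS", request.engineer, action, allowed, now))
        return allowed

    def close_out(self, request: AccessRequest) -> List[Tuple]:
        """Revoke access and hand back evidence of exactly what was done."""
        request.expires_at = datetime.now(timezone.utc)
        self.audit_trail.append(("CLOSED", request.engineer, request.customer, request.expires_at))
        return [entry for entry in self.audit_trail if request.engineer in entry]
```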

Venky – Ideally, the application would be perfect and patch itself, but the reality is that it’s never perfect. Many moving parts exist, and a WAF becomes a lifesaver for meeting compliance needs, like a virtual patch. My analogy is that it’s like a sealant on a flat tire: the right fix is to repair the puncture, but you still need the sealant to get you there. It also serves as a layer of defense that gives you hacker-related analytics: not just the vulnerability, but attacker intent and where they are coming from. 

In summary, they do. WAF and WAAP are necessary but not sufficient. I’ll tell you why. 

Consider that your home has a door, and it’s protected by one lock. All the bad actor needs to do is break that one lock and enter your home. In contrast, this notion of defense in depth says I can leverage multiple security measures to protect you. 

For example, you can install two locks and a security system that turns on when nobody’s home. You could leave my dog Copper at home, and he’ll bark. You can do multiple, different things. 

The goal is the same: to prevent the thief from getting in. So, the thinking is that if one line of defense is compromised, additional layers exist as a backup to ensure that the threats are stopped. 

And that’s why I said WAF and WAAP are necessary but insufficient. 

A WAF, as defined by the PCI Security Standards Council, is a security policy enforcement point between a web application and a client endpoint. There will always be a need for that; you need something in the middle. WAAP does that, plus it covers APIs.  

Not all organizations have the people, processes, technology, and skills to protect against attacks from bad actors. Not all of them will know about SQL injection, cross-site scripting, bot attacks, or DDoS. And I think that’s the sweet spot for things like WAF and WAAP. 

WAF and WAAP tools are additional layers of defense. 

And consider this: today, WAF and WAAP have become a commodity. The second observation is that as organizations take their time to embrace DevSecOps, shift left, and so on, there will continue to be zero-day exploits that need immediate mitigation. That’s not going away.   

So then the question is, if you look at the WAF and WAAP space, how do those solutions differentiate going forward? 

  • How well they mitigate service disruptions and increase operational efficiency. 
  • How much they improve the customer’s overall security posture for their business needs.   

But now generative AI and large language models are here. And we’ve already seen examples of malicious hackers using ChatGPT to generate malware; you know what they can do. 

Consider phishing or social engineering attacks: ChatGPT is arguably better than humans at crafting that kind of content. Spam and phishing emails no longer have the telltale writing, spelling, or grammar mistakes; they look good. 

Given all these new attack vectors, I am very interested in the next set of innovations in the WAF / WAAP space. And differentiation comes from there. 

Two big points. 

  1. WAF and WAAP are necessary but not sufficient. You still need an overall security posture. 
  2. The bad actors are not asleep. They are getting smarter every day.   

So then there is a whole notion of evolution. That’s where I see the role of WAF and WAAP. 

Short answer: yes. But you have to realize this is all still hot off the press; it’s still early days. I’m sure you folks will do some wonderful things with it. But the point is that it’s already here, so you can’t ignore it. Some customer will ask, “Hey, what are you going to do about this?” That was my overarching point. 

I leave you with three guiding principles relevant to customers’ app and data security for consideration. 

    1. Principle one is around the customer scenario. The what and the why come first, and the technology and the how follow. 

To paraphrase this, an Amazon leadership principle says to start with the customer and then work backward. 

I’ve seen teams start with the technology first, with disastrous consequences for customers. In my experience, the whole “build it and they will come” adage is a myth. I’ve had my fair share of good ideas that died a horrible death when they met the customer for the first time.   

    2. Principle two is that all the data your app and service collect from the customer belongs to the customer, not to you. That’s what GDPR says: it’s not your data; it’s theirs. Your app is the data processor; the customer is the data controller. Think of your app or service as a bank, a bank for the customer’s data. Your app and service use that data to provide customers with a great experience and quality of service. But the data is the customer’s data, not yours, which means you have made a promise to the customer to safeguard it. They have the right to withdraw that data and walk away. 

If you don’t have the right controls on their data, or if they don’t see value in your service, that changes the conversation. 

    3. The last principle is that building secure software doesn’t happen by accident. The mindset of securing your customer’s data to win their trust is paramount here. Teams that treat security as an afterthought inevitably experience service disruptions due to security vulnerabilities. 

It takes a lot of hard work to mitigate such problems, and even then there is no guarantee you can fully mitigate them. Such incidents, and there have been a number of them, erode customer trust and result in a loss of business. 

And more importantly, it tarnishes the reputation of your brand. That’s what is at stake. And once your brand reputation is gone, it’s gone.